(57) Abstract

The present invention provides polynucleotides that identify and encode two human steroid binding proteins (hSBP). The invention
provides for genetically engineered expression vectors and host cells comprising the nucleic acid sequences encoding hSBP polypeptides.
The invention also provides for the use of substantially purified hSBP polypeptides, antagonists, and nucleotide sequences (e.g., antisense
sequences) in pharmaceutical compositions for the treatment of diseases associated with the expression of hSBP, specifically in the treatment
of breast cancer. The invention also describes diagnostic assays for the detection of breast cancer in a susceptible or affected patient. The
diagnostic assays utilize compositions comprising the polynucleotides encoding hSBP polypeptides or the complements thereof, which
hybridize with the genomic sequence or the transcript of polynucleotides encoding hSBP, or anti-hSBP antibodies that specifically bind to
an hSBP polypeptide.

HUMAN BREAST TUMOR-SPECIFIC PROTEINS
TECHNICAL FIELD 

The present invention relates to nucleic acid and amino acid sequences of proteins that are 
differentially expressed in human breast tumor cells and to the use of these sequences in the 
diagnosis, study, prevention and treatment of disease.

BACKGROUND ART 
Development of breast cancer is associated with multiple genetic changes that alter the
expression of specific genes. Breast cancer tissues express genes that are not
expressed, or are expressed at lower levels, by normal breast tissue. Thus, it is possible to
differentiate between normal (non-cancerous) breast tissue and cancerous breast tissue by
analyzing differential gene expression between tissues. In addition, there may be several possible
alterations that lead to the various possible types of breast cancer. Thus, different types of breast
tumors (e.g., invasive vs. non-invasive, ductal vs. axillary lymph node) can be differentiated
from one another by identifying differences in the genes expressed by different types of
breast tumor tissues (Porter-Jordan et al. 1994 Hematol Oncol Clin North Am 8:73-100). Breast
cancer can thus be generally diagnosed by detection of expression of a gene or genes associated
with breast tumor tissue. Where enough information is available about the differential gene
expression between various types of breast tumor tissues, the specific type of breast tumor can
also be diagnosed.

Nucleotide and amino acid sequences associated with breast tumors can serve as genetic
markers of inheritable breast cancer. Genetic changes on chromosome 17 are the most frequently
identified events associated with breast tumors. At least four markers on chromosome 17 have
been identified: p53 on 17p13.1, regions of loss of heterozygosity (LOH) on 17p13.3 and
17q12-qter, the breast/ovarian cancer locus (BRCA-1) on 17q21, and a fourth breast cancer growth
suppressor gene on chromosome 17 (Casey et al. 1993 Hum Molec Genet 2:1921-1927).

Such genetic markers can also be useful in identifying patients susceptible to breast
cancer. For example, the genetic marker BRCA-1 has been linked to a susceptibility of
developing breast and/or ovarian cancer at a young age in a number of families (Hall et al. 1990
Science 250:1684-1689; Solomon et al. 1991 Cytogenet Cell Genet 58:686-738). The
cumulative risks of developing breast cancer associated with the BRCA-1 marker are 50% at 50
years and 82% at 70 years (Easton et al. 1993 Am J Hum Genet 52:678-701). However, since the
gene encoding BRCA-1 has not been cloned or sequenced, identification of an individual carrier
of BRCA-1 is not possible without use of linkage analysis. Linkage analysis is generally not
feasible in clinical practice since the genetic epidemiology required is tedious, if not impossible,
in most cases (Kent et al. 1995 Europ J Surg Oncol 21:240-241).

The discovery of nucleotide sequences and polypeptides encoding proteins associated 
with breast cancer would satisfy a need in the art by providing new means of diagnosing and 
treating breast cancer. 

DISCLOSURE OF THE INVENTION 

The present invention features two human steroid binding proteins (hereinafter referred to
individually as hSBP1 and hSBP2, and collectively as hSBP), and the full-length nucleotide
sequences encoding these proteins, which are differentially expressed in human breast tumor
tissue. The transcripts encoding these proteins are present in breast tumor tissue. The first
polypeptide, referred to hereinafter as human steroid binding protein C1 (hSBP1), is
characterized as having amino acid sequence homology to rat prostatic binding proteins C1 and
C2 (PSC1_RAT and PSC2_RAT, respectively) and nucleotide sequence homology to hamster
FHG22 (GI 206441). The second polypeptide, referred to hereinafter as human steroid binding
protein C2 (hSBP2), is characterized as having identity to human mammaglobin and homology to
rat prostatic binding protein C3 (GI 206448). Accordingly, the invention features two
substantially purified human steroid binding proteins, as shown in amino acid sequences of SEQ
ID NO:1 and SEQ ID NO:3.

One aspect of the invention features isolated and substantially purified polynucleotides 
that encode hSBP. In a particular aspect, the polynucleotide is the nucleotide sequence of SEQ
ID NO:2 and SEQ ID NO:4. In addition, the invention features polynucleotide sequences that
hybridize under stringent conditions to SEQ ID NO:2 and SEQ ID NO:4.

The invention additionally features nucleic acid sequences encoding hSBP polypeptides, 
oligonucleotides, peptide nucleic acids (PNA), fragments, portions or antisense molecules 
thereof, and expression vectors and host cells comprising polynucleotides that encode hSBP. The 
present invention also relates to antibodies which bind specifically to an hSBP polypeptide, 
pharmaceutical compositions comprising substantially purified hSBP, fragments thereof, or 
antagonists of hSBP, in conjunction with a suitable pharmaceutical carrier, and methods for 
producing hSBP. 

BRIEF DESCRIPTION OF DRAWINGS 

Figure 1 shows the amino acid sequence (SEQ ID NO:1) and nucleic acid sequence (SEQ
ID NO:2) of human steroid binding protein C1, hSBP1. The alignment was produced using
MacDNAsis software (Hitachi Software Engineering Co Ltd, San Bruno, CA).

Figures 2A and 2B show the amino acid sequence (SEQ ID NO:3) and nucleic acid
sequence (SEQ ID NO:4) of human steroid binding protein C2, hSBP2 (MacDNAsis software,
Hitachi Software Engineering Co Ltd).

Figure 3 shows the northern analysis for the consensus sequence (SEQ ID NO:2) for
hSBP1 (Incyte clone 606491). The northern analysis was produced electronically using the
LIFESEQ™ database (Incyte Pharmaceuticals, Palo Alto CA). The abundance data (Abun)
represent the number of transcripts of the gene of interest in the cDNA library. Percent
abundance is calculated by dividing the number of transcripts of a gene of interest present in a
cDNA library by the total number of transcripts in the cDNA library.

Figure 4 shows the northern analysis for the consensus sequence (SEQ ID NO:4)
(LIFESEQ™ database, Incyte Pharmaceuticals, Palo Alto CA).

Figure 5 shows the amino acid sequence alignments among hSBP1 (606491; SEQ ID
NO:1), rat prostatic binding proteins C1 and C2 (SEQ ID NOS:5 and 8), and rabbit uteroglobin
(SEQ ID NO:9), produced using the multisequence alignment program of DNAStar software
(DNAStar Inc, Madison WI).

Figure 6 shows the amino acid sequence alignments among hSBP2 (SEQ ID NO:3),
human mammaglobin (GI 1199595; SEQ ID NO:10), and rat prostatic binding protein C3 (GI
206453; SEQ ID NO:12), produced using the multisequence alignment program of DNAStar
software (DNAStar Inc, Madison WI).

Figures 7A and 7B show the nucleotide sequence alignments between hSBP1 (606491;
SEQ ID NO:2), hamster FHG22 (GI 1045204; SEQ ID NO:7), and rat prostatic binding protein
C1 (GI 206441; SEQ ID NO:6).

Figures 8A and 8B show the nucleotide sequence alignments between hSBP2 (602516;
SEQ ID NO:4), human mammaglobin (GI 1199595; SEQ ID NO:11), and rat prostatic binding
protein C3 (GI 206452; SEQ ID NO:13).

MODES FOR CARRYING OUT THE INVENTION 

Definitions 

"Nucleic acid sequence" as used herein refers to an oligonucleotide, nucleotide or
polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic
origin which can be single- or double-stranded, and represent the sense or antisense strand.
Similarly, "amino acid sequence" as used herein refers to an oligopeptide, peptide, polypeptide,
or protein sequence. Where "amino acid sequence" is recited herein to refer to an amino acid
sequence of a naturally-occurring protein molecule, "amino acid sequence" and like terms (e.g.,
polypeptide, or protein) are not meant to limit the amino acid sequence to the complete, native
amino acid sequence associated with the recited protein molecule.

"Peptide nucleic acid" as used herein refers to a molecule which comprises an oligomer to 
which an amino acid residue, such as lysine, and an amino group have been added. These small 
molecules, also designated anti-gene agents, stop transcript elongation by binding to their 
complementary (template) strand of nucleic acid (Nielsen PE et al (1993) Anticancer Drug Des
8:53-63).

As used herein, "SBP" refers to the amino acid sequences of substantially purified steroid 
binding protein obtained from any species, particularly mammalian, including bovine, ovine, 
porcine, murine, equine, and preferably human, from any source whether natural, synthetic, 
semi-synthetic or recombinant. The term "hSBP" as used herein refers to human steroid binding 
protein and is meant to encompass hSBP1 and hSBP2 polypeptides collectively.

As used herein, "antigenic amino acid sequence" means an amino acid sequence that, 
either alone or in association with a carrier molecule, can elicit an antibody response in a 
mammal. 

A "variant" of hSBP is defined as an amino acid sequence that is altered by one or more
amino acids. The variant can have "conservative" changes, wherein a substituted amino acid has
similar structural or chemical properties, e.g., replacement of leucine with isoleucine. More
rarely, a variant can have "nonconservative" changes, e.g., replacement of a glycine with a
tryptophan. Similar minor variations can also include amino acid deletions or insertions, or both.
Guidance in determining which and how many amino acid residues may be substituted, inserted
or deleted without abolishing biological or immunological activity can be found using computer
programs well known in the art, for example, DNAStar software.

A "deletion" is defined as a change in either amino acid or nucleotide sequence in which 
one or more amino acid or nucleotide residues, respectively, are absent. 

An "insertion" or "addition" is that change in an amino acid or nucleotide sequence which 
has resulted in the addition of one or more amino acid or nucleotide residues, respectively, as
compared to the naturally occurring hSBP. 

A "substitution" results from the replacement of one or more amino acids or nucleotides
by different amino acids or nucleotides, respectively.

The term "biologically active" refers to a hSBP having structural, regulatory, or
biochemical functions of a naturally occurring hSBP. Likewise, "immunologically active"
defines the capability of the natural, recombinant or synthetic hSBP, or any oligopeptide thereof,
to induce a specific immune response in appropriate animals or cells and to bind with specific
antibodies.

The term "derivative" as used herein refers to the chemical modification of a nucleic acid 
encoding hSBP or the encoded hSBP. Illustrative of such modifications would be replacement of 
hydrogen by an alkyl, acyl, or amino group. A nucleic acid derivative would encode a
polypeptide which retains essential biological characteristics of natural hSBP.

As used herein, the term "substantially purified" refers to molecules, either nucleic or
amino acid sequences, that are removed from their natural environment, isolated or separated, 
and are at least 60% free, preferably 75% free, and most preferably 90% free from other 
components with which they are naturally associated. 

"Stringency" typically occurs in a range from about Tm-5°C (5°C below the Tm of the
probe) to about 20°C to 25°C below Tm. As will be understood by those of skill in the art, a
stringency hybridization can be used to identify or detect identical polynucleotide sequences or to
identify or detect similar or related polynucleotide sequences.
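
The stringency range defined above can be expressed as a small calculation. The following Python sketch is illustrative only: the function name and return layout are hypothetical, the lower bound is assumed to read 20°C to 25°C below Tm, and the Tm itself is taken as a given input rather than computed.

```python
def stringency_window(tm_celsius: float) -> dict:
    """Temperature bounds (in degrees C) implied by the stringency range above.

    tm_celsius is the melting temperature of the probe, supplied by the caller.
    """
    return {
        # Upper end of the range: about 5 degrees C below Tm
        "upper": tm_celsius - 5.0,
        # Lower end of the range: about 20 to 25 degrees C below Tm
        "lower_range": (tm_celsius - 25.0, tm_celsius - 20.0),
    }


if __name__ == "__main__":
    # Hypothetical probe with a Tm of 70 degrees C
    print(stringency_window(70.0))
```

Hybridization near the upper bound would favor detection of identical sequences, while temperatures in the lower range would also permit detection of similar or related sequences.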

The term "hybridization" as used herein shall include "any process by which a strand of
nucleic acid joins with a complementary strand through base pairing" (Coombs J (1994)
Dictionary of Biotechnology, Stockton Press, New York NY). Amplification as carried out in the
polymerase chain reaction technologies is described in Dieffenbach CW and GS Dveksler (1995,
PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview NY).

Preferred Embodiments

The present invention relates to hSBP and to the use of hSBP nucleic acid and amino acid
sequences in the study, diagnosis, prevention and treatment of disease. cDNAs encoding a
portion of hSBP were predominantly found in cDNA libraries derived from breast tumor tissue
(Figures 3 and 4). The abundance data (Abun) reflect the relative level of expression of the hSBP
sequence in the breast, thymus and prostatic cDNA libraries, with the percentage abundance (Pct
Abun) representing the percent of total expressed mRNAs that are homologous to the hSBP
sequence.
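
The percent abundance calculation used in the electronic northern analysis (transcripts of the gene of interest divided by total transcripts in the library) can be sketched as follows. This is a minimal Python illustration; the function name and the example counts are hypothetical and are not taken from Figures 3 or 4.

```python
def percent_abundance(gene_transcripts: int, total_transcripts: int) -> float:
    """Percent of transcripts in a cDNA library that match the gene of interest."""
    if total_transcripts <= 0:
        raise ValueError("total_transcripts must be positive")
    if gene_transcripts < 0 or gene_transcripts > total_transcripts:
        raise ValueError("gene_transcripts must lie between 0 and total_transcripts")
    return 100.0 * gene_transcripts / total_transcripts


if __name__ == "__main__":
    # Hypothetical library: 5 matching transcripts out of 2000 sequenced
    print(percent_abundance(5, 2000))
```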

The present invention also encompasses hSBP variants. A preferred hSBP variant is one
having at least 80% amino acid sequence similarity to an amino acid sequence of an hSBP (i.e.,
an hSBP1 amino acid sequence (SEQ ID NO:1) or an hSBP2 amino acid sequence (SEQ ID
NO:3)). A more preferred hSBP variant is one having at least 90% amino acid sequence
similarity to SEQ ID NO:1 or SEQ ID NO:3. A most preferred hSBP variant is one having at
least 95% amino acid sequence similarity to SEQ ID NO:1 or SEQ ID NO:3.
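
The three similarity thresholds above can be summarized as a simple lookup. The sketch below is illustrative: the function name is hypothetical, and the similarity score is assumed to have been computed beforehand by an alignment program of the user's choice.

```python
from typing import Optional


def variant_tier(similarity_percent: float) -> Optional[str]:
    """Map a percent amino acid sequence similarity to the preference tier above.

    Returns None when the similarity falls below the 80% threshold.
    """
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity_percent must be between 0 and 100")
    if similarity_percent >= 95.0:
        return "most preferred"
    if similarity_percent >= 90.0:
        return "more preferred"
    if similarity_percent >= 80.0:
        return "preferred"
    return None
```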

Nucleic acids encoding the human hSBP of the present invention were first identified in
cDNA, Incyte Clones 606491 and 602615 from breast tumor cell cDNA library BRSTTUT01,
through a computer-generated search for amino acid sequence alignments. A consensus sequence
for each of hSBP1 (SEQ ID NO:2) and hSBP2 (SEQ ID NO:4) was derived from the
overlapping and/or extended nucleic acid sequences as shown in the tables below.

Table 1. Clones from which the consensus sequence (SEQ ID NO:2) of hSBP-C1 was derived.

Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library
419412H1       BRSTNOT01      606371H1       BRSTTUT01      1212741H1      BRSTTUT01
603148H1       BRSTTUT01      606491H1       BRSTTUT01      1215122H1      BRSTTUT01
603224H1       BRSTTUT01      825519H1       PROSNOT06      1216374H1      BRSTTUT01
604290H1       BRSTTUT01      967077H1       BRSTNOT05      1217152H1      BRSTTUT01
604954H1       BRSTTUT01      1209955H1      BRSTNOT02
605120H1       BRSTTUT01      1212005H1      BRSTTUT01

Table 2. Clones from which the consensus sequence (SEQ ID NO:4) of hSBP-C2 was derived.

Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library
410758H1       BRSTNOT01      899784         BRSTTUT03      968163H1       BRSTNOT05
419059H1       BRSTNOT01                                    977969H1       BRSTNOT02
598065H1       BRSTNOT02      899895H1       BRSTTUT03      10000571H1     BRSTNOT03
601000H1       BRSTNOT02      900115H1       BRSTTUT03      1002776H1      BRSTNOT03
602615H1       BRSTTUT01      901009H1       BRSTTUT03      1004904H1      BRSTNOT02
603548H1       BRSTTUT01      902666H1       BRSTTUT03      1210748H1      BRSTNOT02
603234H1       BRSTTUT01      902354H1       BRSTTUT03
603999H1       BRSTTUT01      959213H1       BRSTTUT03      1212473H1      BRSTTUT01
605093H1       BRSTTUT01      959506H1       BRSTTUT03      1213350H1      BRSTTUT01
605204H1       BRSTTUT01      960045H1       BRSTTUT03      1213570H1      BRSTTUT01
605215H1       BRSTTUT01      960118H1       BRSTTUT03      1213702H1      BRSTTUT01
605561H1       BRSTTUT01      960656H1       BRSTTUT03      1214253H1      BRSTTUT01
606191H1       BRSTTUT01      962153H1       BRSTTUT03      1214304H1      BRSTTUT01
606289H1       BRSTTUT01      962283H1       BRSTTUT03      1214401H1      BRSTTUT01
606611H1       BRSTTUT01      962488H1       BRSTTUT03      1215366H1      BRSTTUT01
606664H1       BRSTTUT01      962656H1       [illegible]    [illegible]    [illegible]
607089H1       BRSTTUT01      962907H1       BRSTTUT03      1216546H1      BRSTTUT01
897552H1       BRSTNOT05      963043H1       BRSTTUT03      1216653H1      BRSTTUT01
898516H1       BRSTTUT03      963046H1       BRSTTUT03      1216659H1      BRSTTUT01
898821H1       BRSTTUT03      964108H1       BRSTTUT03      1216778H1      BRSTTUT01
899628H1       BRSTTUT03      968127H1       BRSTNOT05


The nucleic acid sequence of SEQ ID NO:2 encodes the hSBP1 amino acid sequence, SEQ ID
NO:1. The nucleic acid sequence of SEQ ID NO:4 encodes the hSBP2 amino acid sequence,
SEQ ID NO:3.

The present invention is based, in part, on the chemical and structural homology between:
1) the amino acid sequence of hSBP1 and rat prostatic binding protein C1 (GI 206442; Delaey et
al. 1983 Eur J Biochem 133:645-649), rat prostatic binding protein C2 (Delaey et al. 1987 Nucl
Acid Res 15:1627-1641), and rabbit uteroglobin (Menne et al. 1982 Proc Natl Acad Sci USA
79:4853-4857; Figure 5), and the amino acid sequences of hSBP2, human mammaglobin (GI
1199595; SEQ ID NO:10), and rat prostatic binding protein C3 (GI 206453; SEQ ID NO:12;
Figure 6); and 2) the nucleotide sequences encoding hSBP1, rat prostatic binding protein C1
(GI 206442; Delaey et al. supra), and hamster FHG22 (GI 1045204; Dominguez 1995 FEBS
Letters 376:257-263; Figures 7A and 7B), and those encoding hSBP2, human mammaglobin (GI 1199595;
Watson et al. 1996 Cancer Res 56:860-865), and rat prostatic binding protein C3 (GI 206452;
Parker et al. 1983 J Biol Chem 258:12-15) (Figures 8A and 8B).

Rat prostatic binding protein (rPBP) is a tetrameric, steroid-binding glycoprotein found in 
rat ventral prostate, and is the principal protein in rat prostatic fluid (Delaey et al. supra; Parker et 
al. supra; Heyns et al. 1977 Eur J Biochem 78:221-230; Heyns et al. 1977 Biochem Biophys Res 
Commun 77:1492-1499; Parker et al. 1978 Eur J Biochem 85:399-406). The rPBP tetramer is
composed of two subunits: one subunit containing the polypeptides C1 and C3; and the other
subunit containing the polypeptides C2 and C3 (Heyns et al. 1978 Eur J Biochem 89:181-186).
rPBP C3 is homologous to human mammaglobin, which in turn is homologous to human Clara
cell 10-kilodalton protein and rabbit uteroglobin (Watson et al. supra). 

Although rat PBP is primarily expressed in the testes (Lindzey et al. 1994 Vitamins
Hormones 49:383-32), transgenic animals harboring a construct containing the 5' flanking region
of the rat PBP-C3 gene linked to the coding region for the simian virus 40 large tumor antigen
express the transgene in both the prostate and the mammary gland (Allison et al. 1989 Mol Cell
Biol 9(5):2254-2257). The expression of the C3 transgene varies with the sex of the transgenic
animal; male transgenic animals express the rat PBP C3 transgene in the prostate and develop
prostate carcinoma, while the females express the transgene in the mammary gland and develop
atypical mammary hyperplasia (Maroulakou et al. 1994 Proc Natl Acad Sci USA 91:11236-40).
Expression of rPBP is regulated by androgenic steroids (e.g., testosterone), partly by stimulating
rates of transcription and partly by effects on RNA stability (Parker et al. 1977 Cell 12:401-407;
Heyns et al. 1977 Biochem Biophys Res Commun 77:1492-1499; Parker et al. 1979 Proc Natl
Acad Sci USA 76:1580-1584; Page et al. 1982 Mol Cell Endocr 27:343-355).

rPBP is similar to estramustine binding protein (EMBP) (Heyns et al. 1977 Eur J Biochem
78:221-30). EMBP is a 46-kDa heterodimer consisting of two closely related subunits, each of
which is divided into two components upon reductive cleavage of disulfide bridges. The
subunits differ with respect to the components C1 and C2, but share C3 (Bjork et al. 1995 The
Prostate 27:70-83). EMBP binds estramustine (Appelgren et al. 1979 Acta Pharmacol
Toxicol 43:368-74; Forsgren et al. 1979 Cancer Res 39:5155-64; Hoisaeter et al. 1981 J Steroid
Biochem 14:251-60), but does not bind free estrogens (Hoisaeter et al. 1981 J Steroid Biochem
14:251-260; Forsgren et al. 1979 Proc Natl Acad Sci USA 76:3149-3150). Estramustine, a
nitrogen mustard derivative of 17β-estradiol (Mittelman et al. 1977 Cancer Treat Rep 61:307-10;
Johnson et al. 1971 Scand J Urol Nephrol 5:103-7), is used to treat patients with prostatic
carcinoma. Expression of EMBP is androgen-regulated; this androgen-dependency of EMBP
tends to decline with the transformation of prostatic tissue into biologically more malignant
disease (Shiina et al. 1996 Brit J Urol 77:96-101). The ratio of EMBP to dihydroxytestosterone
is an indicator of the malignant potential of prostatic carcinoma (Shiina et al. supra).

Rabbit uteroglobin, a homodimeric protein coupled by two disulfide linkages, binds
progesterone and structurally related steroids, is a substrate for transglutaminases, inhibits
phospholipase A2 activity, and may interfere with the immune and inflammatory activity of
several cell types (Miele et al. 1994 J Endocrinol Invest 17:679-692; Miele et al. 1987 Endocrinol
Rev 8:474-490). Expression of uteroglobin is regulated by tissue-specific response to steroid
hormones (Sandmoller et al. 1994 Oncogene 9:2805-2815).

FHG22 protein was isolated from a female minus male subtracted cDNA library obtained
from the sexually dimorphic Syrian hamster Harderian glands (Dominguez supra). The FHG22
nucleotide and amino acid sequences are similar to those of the subunits of rat prostatic steroid binding
protein C1, uteroglobin (Miele et al. 1994 J Endocrinol Invest 17:679-692), major cat allergen
Fel dI (chain I), and mouse salivary androgen binding proteins (subunit a) (Kam et al. 1993
Biochem Genet 32:271-277; Dominguez supra). Expression of FHG22 is tissue- and
sex-dependent (Dominguez supra).

hSBP1 and rat prostatic binding protein C1 share 55% identity at the
nucleotide sequence level, whereas hSBP1 and hamster FHG22 share 72% nucleotide sequence
identity. hSBP1 is 90 amino acids in length; the amino acid sequence of hSBP1 has 49% identity
with the amino acid sequence of rat prostatic binding protein C1 (SEQ ID NO:5), 44% identity
with the amino acid sequence of rat prostatic binding protein C2 (SEQ ID NO:8), and 28%
identity with the amino acid sequence of rabbit uteroglobin (SEQ ID NO:9) (Figure 5).
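
Percent identity values of the kind quoted above are derived from pairwise alignments. The Python sketch below shows one common convention, assumed here for illustration: identical residues divided by the number of aligned columns in which neither sequence has a gap. The alignment programs named in this document may use a different denominator, and the example sequences are hypothetical toy strings, not the sequences of the figures.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence carries a gap character are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # gap columns are excluded under this convention
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 4 identical positions out of 5 ungapped columns
    print(percent_identity("ACGT-A", "ACCTTA"))
```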

hSBP2 is 93 amino acids in length and shares 99% nucleotide sequence identity with
human mammaglobin; the nucleotide sequence of hSBP2 is about 43% identical to the nucleotide
sequence of rat prostatic binding protein C3 (Figures 8A and 8B). The amino acid sequence of
hSBP2 is 62% identical to the amino acid sequence of rat prostatic protein C3, and 100%
identical to the amino acid sequence of human mammaglobin (Figure 6). Thus, hSBP-C3 is
identical to human mammaglobin.

The hSBP Coding Sequences

The nucleic acid and deduced amino acid sequences of hSBP are shown in Figures 1
(hSBP1) and 2A and 2B (hSBP2). In accordance with the invention, any nucleic acid sequence
that encodes an amino acid sequence of an hSBP polypeptide can be used to generate
recombinant molecules which express an hSBP polypeptide. In specific embodiments described
herein, a nucleotide sequence encoding a portion of hSBP1 was first isolated as Incyte Clone
606491 from a breast tumor cell line cDNA library BRSTTUT01; and a nucleotide sequence
encoding a portion of hSBP2 was first isolated as Incyte Clone 602615 from a breast tumor cell
line cDNA library BRSTTUT01.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of degenerate variants of hSBP-encoding nucleotide sequences, some
bearing minimal homology to the nucleotide sequences of any known and naturally occurring
gene, can be produced. The invention contemplates each and every possible variation of
nucleotide sequence that can be made by selecting combinations based on possible codon
choices. These combinations are made in accordance with the standard triplet genetic code as
applied to the nucleotide sequence of naturally occurring hSBP, and all such variations are to be
considered as being specifically disclosed herein.
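
The size of the "multitude of degenerate variants" follows directly from the standard genetic code: the number of distinct coding sequences for a peptide is the product of the number of synonymous codons for each residue. The Python sketch below illustrates this count; the function name and the example peptide are hypothetical and are not drawn from SEQ ID NO:1 or SEQ ID NO:3.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter codes)
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}
STOP_CODONS = 3  # TAA, TAG and TGA


def count_coding_sequences(peptide: str, include_stop: bool = False) -> int:
    """Count the nucleotide sequences that encode the given peptide."""
    count = prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
    return count * STOP_CODONS if include_stop else count


if __name__ == "__main__":
    # Hypothetical tripeptide Met-Lys-Leu: 1 x 2 x 6 possible coding sequences
    print(count_coding_sequences("MKL"))
```

Because the count grows multiplicatively with length, even a polypeptide of fewer than 100 residues has far more synonymous coding sequences than could be listed individually.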

Although nucleotide sequences that encode hSBP and its variants are preferably capable
of hybridizing to the nucleotide sequence of the naturally occurring hSBP under appropriately
selected conditions of stringency, it may be advantageous to produce nucleotide sequences
encoding hSBP or its derivatives possessing a substantially different codon usage. Codons can
be selected to increase the rate at which expression of the peptide occurs in a particular
prokaryotic or eukaryotic expression host in accordance with the frequency with which particular
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence
encoding hSBP and its derivatives without altering the encoded amino acid sequences include the
production of RNA transcripts having more desirable properties (e.g., increased half-life) than
transcripts produced from the naturally occurring sequence.

It is now possible to produce a nucleotide sequence encoding an hSBP polypeptide and/or
its derivatives entirely by synthetic chemistry, after which the synthetic gene can be inserted into
any of the many available DNA vectors and expression systems using reagents that are well
known in the art at the time of the filing of this application. Moreover, synthetic chemistry can
be used to introduce mutations into a sequence encoding an hSBP polypeptide.

Also included within the scope of the present invention are polynucleotide sequences that
are capable of hybridizing to the nucleotide sequences of Figures 1A-B and/or 2A-B under
various conditions of stringency. Hybridization conditions are based on the melting temperature
(Tm) of the nucleic acid binding complex or probe, as taught in Berger and Kimmel (1987,
Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol 152, Academic Press,
San Diego CA) incorporated herein by reference, and can be used at a defined stringency.

Altered nucleic acid sequences encoding hSBP that can be used in accordance with the
invention include deletions, insertions or substitutions of different nucleotides resulting in a
polynucleotide that encodes the same or a functionally equivalent hSBP. The protein can also
comprise deletions, insertions or substitutions of amino acid residues that result in a polypeptide
that is functionally equivalent to hSBP. Deliberate amino acid substitutions can be made on the
basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the
amphipathic nature of the residues with the proviso that biological activity of hSBP is retained.
For example, negatively charged amino acids include aspartic acid and glutamic acid; positively
charged amino acids include lysine and arginine; and amino acids with uncharged polar head
groups having similar hydrophilicity values include leucine, isoleucine, valine; glycine, alanine;
asparagine, glutamine; serine, threonine; phenylalanine, and tyrosine.
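
The residue groupings listed above can be used to test whether a proposed substitution stays within a group. The Python sketch below is illustrative only; it assumes the list is read as seven groups (Asp/Glu; Lys/Arg; Leu/Ile/Val; Gly/Ala; Asn/Gln; Ser/Thr; Phe/Tyr), and the names used are hypothetical. Membership in a group does not by itself establish that biological activity is retained.

```python
# One-letter codes for the residue groups named in the text (assumed grouping)
CONSERVATIVE_GROUPS = [
    frozenset("DE"),   # negatively charged: aspartic acid, glutamic acid
    frozenset("KR"),   # positively charged: lysine, arginine
    frozenset("LIV"),  # leucine, isoleucine, valine
    frozenset("GA"),   # glycine, alanine
    frozenset("NQ"),   # asparagine, glutamine
    frozenset("ST"),   # serine, threonine
    frozenset("FY"),   # phenylalanine, tyrosine
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues differ and fall within the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # not a substitution at all
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # leucine to isoleucine
    print(is_conservative("G", "W"))  # glycine to tryptophan
```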

Alleles of hSBP are also encompassed by the present invention. As used herein, an
"allele" or "allelic sequence" is an alternative form of hSBP. Alleles result from a mutation (i.e.,
an alteration in the nucleic acid sequence) and generally produce altered mRNAs and/or
polypeptides that may or may not have an altered structure or function relative to
naturally-occurring hSBP. Any given gene may have none, one, or many allelic forms. Common
mutational changes that give rise to alleles are generally ascribed to natural deletions, additions
or substitutions of amino acids. Each of these types of changes may occur alone or in
combination with the other changes, and may occur once or multiple times in a given sequence.

Methods for DNA sequencing are well known in the art and employ such enzymes as the
Klenow fragment of DNA polymerase I, Sequenase® (US Biochemical Corp, Cleveland OH),
Taq polymerase (Perkin Elmer, Norwalk CT), thermostable T7 polymerase (Amersham, Chicago
IL), or combinations of recombinant polymerases and proofreading exonucleases such as the
ELONGASE Amplification System marketed by Gibco BRL (Gaithersburg MD). Preferably, the
process is automated with machines such as the Hamilton Micro Lab 2200 (Hamilton, Reno NV),
Peltier Thermal Cycler (PTC200; MJ Research, Watertown MA) and the ABI 377 DNA
sequencers (Perkin Elmer).

Extending the Polynucleotide Sequence 

The polynucleotide sequence encoding hSBP can be extended utilizing partial nucleotide
sequence and various methods known in the art to detect upstream sequences such as promoters
and regulatory elements. Clones that contain extended sequences are designated by a suffix (see
the tables above). Gobinda et al (1993; PCR Methods Applic 2:318-22) disclose
"restriction-site" polymerase chain reaction (PCR) as a direct method which uses universal
primers to retrieve unknown sequence adjacent to a known locus. First, genomic DNA is
amplified in the presence of a primer to a linker sequence and a primer specific to the known
region. The amplified sequences are subjected to a second round of PCR with the same linker
primer and another specific primer internal to the first one. Products of each round of PCR are
transcribed with an appropriate RNA polymerase and sequenced using reverse transcriptase.

Inverse PCR can be used to amplify or extend sequences using divergent primers based
on a known region (Triglia T et al (1988) Nucleic Acids Res 16:8186). The primers can be
designed using OLIGO® 4.06 Primer Analysis Software (1992; National Biosciences Inc,
Plymouth MN), or another appropriate program, to be 22-30 nucleotides in length, to have a GC
content of 50% or more, and to anneal to the target sequence at temperatures of about 68°-72°C.
This method uses several restriction enzymes to generate a suitable fragment in the known region
of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR
template.
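
The length and base-composition criteria stated for the primers can be checked programmatically. The Python sketch below is a minimal illustration and does not reproduce the OLIGO software: the function names are hypothetical, the content criterion is assumed to refer to G+C content, and the annealing temperature criterion (about 68°-72°C) is not evaluated because no melting-temperature formula is given in the text. The example primers are arbitrary strings.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("primer must be a non-empty string of A, C, G and T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(primer: str) -> bool:
    """Check the stated criteria: 22-30 nucleotides long and at least 50% GC."""
    return 22 <= len(primer) <= 30 and gc_fraction(primer) >= 0.5


if __name__ == "__main__":
    candidate = "GC" * 6 + "AT" * 6  # arbitrary 24-mer with 50% GC content
    print(len(candidate), gc_fraction(candidate), meets_primer_criteria(candidate))
```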

Capture PCR (Lagerstrom M et al (1991) PCR Methods Applic 1:111-19) is a method for
PCR amplification of DNA fragments adjacent to a known sequence in human and yeast artificial
chromosome DNA. Capture PCR also requires multiple restriction enzyme digestions and
ligations to place an engineered double-stranded sequence into an unknown portion of the DNA
molecule before PCR.

Another method that can be used to retrieve unknown sequences is that of Parker JD et al
(1991; Nucleic Acids Res 19:3055-60). Additionally, one can use PCR, nested primers, and
PromoterFinder libraries to "walk in" genomic DNA (PromoterFinder™; Clontech, Palo Alto
CA). This process avoids the need to screen libraries and is useful in finding intron/exon
junctions. Preferably, the libraries used to identify full length cDNAs have been size-selected to
include larger cDNAs. More preferably, the cDNA libraries used to identify full-length cDNAs
are those generated using random primers, in that such libraries will contain more sequences
comprising regions 5' of the sequence(s) of interest. A randomly primed library can be
particularly useful where oligo d(T) libraries do not yield a full-length cDNA. Genomic libraries
are preferred for identification and isolation of 5' nontranslated regulatory regions of a
sequence(s) of interest.

Capillary electrophoresis can be used to analyze the size of, or confirm the nucleotide
sequence of, sequencing or PCR products. Systems for rapid sequencing are available from
Perkin Elmer, Beckman Instruments (Fullerton CA), and other companies. Capillary sequencing
can employ flowable polymers for electrophoretic separation, four different, laser-activatable
fluorescent dyes (one for each nucleotide), and a charge coupled device camera for detection of
the wavelengths emitted by the fluorescent dyes. Output/light intensity is converted to electrical
signal using appropriate software (e.g., Genotyper™ and Sequence Navigator™ from Perkin
Elmer). The entire process from loading of the samples to computer analysis and electronic data
display is computer controlled. Capillary electrophoresis is particularly suited to the sequencing
of small pieces of DNA that might be present in limited amounts in a particular sample.
Capillary electrophoresis provides reproducible sequencing of up to 350 bp of M13 phage DNA
in 30 min (Ruiz-Martinez MC et al (1993) Anal Chem 65:2851-2858).
Expression of the Nucleotide Sequence 

In accordance with the present invention, polynucleotide sequences that encode hSBP
polypeptides (which polypeptides include fragments of the naturally-occurring polypeptide,
fusion proteins, and functional equivalents thereof) can be used in recombinant DNA molecules
that direct the expression of hSBP in appropriate host cells. Due to the inherent degeneracy of
the genetic code, other DNA sequences that encode substantially the same or a functionally
equivalent amino acid sequence can be used to clone and express hSBP. As will be understood
by those of skill in the art, it may be advantageous to produce hSBP-encoding nucleotide
sequences possessing non-naturally occurring codons. Codons preferred by a particular
prokaryotic or eukaryotic host (Murray E et al (1989) Nuc Acids Res 17:477-508) can be
selected, for example, to increase the rate of hSBP expression or to produce recombinant RNA
transcripts having a desirable characteristic(s) (e.g., longer half-life than transcripts produced
from naturally occurring sequence).
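
The use of host-preferred codons described above relies on the degeneracy of the genetic code:
a codon can be swapped for a synonymous one without changing the encoded amino acid. The
following Python sketch illustrates the idea under stated assumptions. The standard genetic
code table is built in the conventional TCAG order; the `preferred` mapping passed by the
caller is a hypothetical placeholder and does not represent measured codon usage for any host.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) into one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Replace each codon with the caller-supplied preferred synonymous codon, if any.

    `preferred` maps a one-letter amino acid to a codon; amino acids without an
    entry keep their original codon.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence is not a whole number of codons")
    for aa, codon in preferred.items():
        if CODON_TO_AA[codon] != aa:  # guard against non-synonymous substitutions
            raise ValueError(f"{codon} does not encode {aa}")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(out)
```

Because only synonymous codons are substituted, `translate` returns the same amino acid
sequence before and after recoding.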

The nucleotide sequences of the present invention can be engineered in order to alter an
hSBP coding sequence for a variety of reasons, including but not limited to, alterations that
facilitate the cloning, processing and/or expression of the gene product. For example, mutations
can be introduced using techniques that are well known in the art, e.g., site-directed mutagenesis
to insert new restriction sites, alter glycosylation patterns, change codon preference, produce
splice variants, etc.

In another embodiment of the invention, a natural, modified, or recombinant 
polynucleotide encoding an hSBP polypeptide can be ligated to a heterologous sequence to 
encode a fusion protein. For example, where an hSBP polypeptide is to be used in a peptide 
library for screening and identification of inhibitors of hSBP activity, it may be desirable to 
provide the hSBP polypeptide in the peptide library as a chimeric hSBP protein that can be 
recognized by a commercially available antibody. A fusion protein can also be engineered to 
contain a cleavage site located between an hSBP polypeptide-encoding sequence and a 
heterologous polypeptide sequence, such that the hSBP polypeptide can be cleaved and purified 
away from the heterologous moiety. 
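
The fusion construct described above, in which a cleavage site lies between the heterologous
sequence and the hSBP polypeptide-encoding sequence, only works if all parts are joined in the
same reading frame. A minimal Python sketch of that check follows. The function name, the
argument names, and any sequences passed to it are hypothetical placeholders; they are not
actual tag, protease-site, or hSBP sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list:
    """Split a sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def assemble_fusion(tag: str, cleavage_site: str, target: str) -> str:
    """Join tag + cleavage site + target coding sequences, verifying the frame.

    Every part must be a whole number of codons, and the upstream parts must not
    contain an in-frame stop codon that would terminate translation early.
    """
    parts = {"tag": tag.upper(), "cleavage_site": cleavage_site.upper(),
             "target": target.upper()}
    for name, part in parts.items():
        if len(part) % 3:
            raise ValueError(f"{name} is not a whole number of codons")
    for name in ("tag", "cleavage_site"):
        if STOP_CODONS & set(codons(parts[name])):
            raise ValueError(f"{name} contains an in-frame stop codon")
    return parts["tag"] + parts["cleavage_site"] + parts["target"]
```

A real construct would also need the cleavage-site codons to encode the recognition sequence
of the chosen protease, which this sketch does not verify.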

In an alternative embodiment of the invention, a nucleotide sequence encoding an hSBP 
polypeptide can be synthesized, in whole or in part, using chemical methods well known in the 
art (see Caruthers et al (1980) Nuc Acids Res Symp Ser 215-23, Horn et al (1980) Nuc Acids Res
Symp Ser 225-32, etc). Alternatively, the polypeptide itself can be produced using chemical
methods to synthesize an hSBP amino acid sequence, in whole or in part. For example, peptide
synthesis can be performed using various solid-phase techniques (Roberge et al (1995) Science
269:202-204) and automated synthesis can be achieved, for example, using the ABI 431A
Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the 
manufacturer. 

The newly synthesized peptide can be substantially purified by preparative high performance
liquid chromatography (e.g., Creighton (1983) Proteins, Structures and Molecular Principles, WH
Freeman and Co, New York NY). The composition of the synthetic peptides can be confirmed
by amino acid analysis or sequencing (e.g., the Edman degradation procedure; Creighton, supra).
Additionally, the amino acid sequence of hSBP, or any part thereof, can be altered during direct
synthesis and/or combined using chemical methods with sequences from other proteins, or any
part thereof, to produce a variant polypeptide.
Expression Systems 

In order to express a biologically active hSBP polypeptide, the nucleotide sequence 
encoding an hSBP polypeptide or its functional equivalent, is inserted into an appropriate 
expression vector, i.e., a vector having the necessary elements for the transcription and translation
of the inserted coding sequence.

Methods well known to those skilled in the art can be used to construct expression vectors
comprising an hSBP polypeptide-encoding sequence and appropriate transcriptional or
translational controls. These methods include in vitro recombinant DNA techniques, synthetic
techniques and in vivo recombination or genetic recombination. Such techniques are described in
Sambrook et al (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press,
Plainview NY and Ausubel FM et al (1989) Current Protocols in Molecular Biology, John Wiley
& Sons, New York NY.

A variety of expression vector/host systems can be utilized to express an hSBP
polypeptide-encoding sequence. These include, but are not limited to, microorganisms such as
bacteria transformed with recombinant bacteriophage, plasmid or cosmid DNA expression
vectors; yeast transformed with yeast expression vectors; insect cell systems infected with virus
expression vectors (e.g., baculovirus); plant cell systems transfected with virus expression vectors
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with
bacterial expression vectors (e.g., Ti or pBR322 plasmid); or animal cell systems.

The "control elements" or "regulatory sequences" of these systems, which vary in their
strength and specificities, are those nontranslated regions of the vector, enhancers, promoters, and
3' untranslated regions that interact with host cellular proteins to facilitate transcription and
translation of a nucleotide sequence of interest. Depending on the vector system and host
utilized, any number of suitable transcriptional and translational elements, including constitutive
and inducible promoters, can be used. For example, when cloning in bacterial systems, inducible
promoters such as the hybrid lacZ promoter of the Bluescript® phagemid (Stratagene, La Jolla
CA) or pSport1 (Gibco BRL), ptrp-lac hybrids, and the like can be used. The baculovirus
polyhedrin promoter can be used in insect cells. Promoters or enhancers derived from the
genomes of plant cells (e.g., heat shock, RUBISCO, and storage protein genes) or from plant
viruses (e.g., viral promoters or leader sequences) can be cloned into the vector. In mammalian
cell systems, promoters from the mammalian genes or from mammalian viruses are most
appropriate. Where it is desirable to generate a cell line containing multiple copies of an hSBP
polypeptide-encoding sequence, vectors derived from SV40 or EBV can be used in conjunction
with other optional vector elements, e.g., an appropriate selectable marker.

In bacterial systems, a number of expression vectors can be used to express an hSBP
polypeptide of interest, and will vary with a variety of factors including the use intended
for the hSBP polypeptide produced. For example, when large quantities of an hSBP polypeptide
are required (e.g., for antibody production), vectors that direct high-level expression of fusion
proteins that can be readily purified may be desirable. Such vectors include, but are not limited
to, the multifunctional E. coli cloning and expression vectors such as Bluescript® (Stratagene;
which provides for in-frame ligation of an hSBP polypeptide-encoding sequence with sequences
encoding the amino-terminal Met and the subsequent 7 residues of β-galactosidase, thereby
producing an hSBP polypeptide-β-galactosidase hybrid protein); pIN vectors (Van Heeke &
Schuster (1989) J Biol Chem 264:5503-5509); and the like. pGEX vectors (Promega, Madison
WI) can also be used to express foreign polypeptides as glutathione S-transferase (GST) fusion
proteins. In general, such GST fusion proteins are soluble and can be easily purified from cell
lysates by adsorption to glutathione-agarose beads followed by elution in the presence of free
glutathione. GST fusion proteins can be designed to include heparin, thrombin or factor Xa
protease cleavage sites so that the cloned polypeptide of interest can be readily separated from the
GST moiety.

Where the host cell is yeast (e.g., Saccharomyces cerevisiae), a number of vectors
containing constitutive or inducible promoters such as alpha factor, alcohol oxidase and PGH can
be used. For reviews, see Ausubel et al (supra) and Grant et al (1987) Methods in Enzymology
153:516-544.

Where plant expression vectors are used, the expression of an hSBP polypeptide-encoding
sequence can be driven by any of a number of promoters. For example, viral promoters such as
the 35S and 19S promoters of CaMV (Brisson et al (1984) Nature 310:511-514) can be used
alone or in combination with the omega leader sequence from TMV (Takamatsu et al (1987)
EMBO J 6:307-311). Alternatively, plant promoters, such as the small subunit of RUBISCO
(Coruzzi et al (1984) EMBO J 3:1671-1680; Broglie et al (1984) Science 224:838-843) or heat
shock promoters (Winter J and Sinibaldi RM (1991) Results Probl Cell Differ 17:85-105), can be
used. These constructs can be introduced into plant cells by direct DNA transformation or
pathogen-mediated transfection. For reviews of such techniques, see Hobbs S or Murry LE in
McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill New York NY, pp
191-196 or Weissbach and Weissbach (1988) Methods for Plant Molecular Biology, Academic
Press, New York NY, pp 421-463.

Alternatively, insect cell expression systems can be used to express an hSBP polypeptide.
In one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a
vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. The
hSBP polypeptide-encoding sequence can be cloned into a nonessential region of the virus, such
as the polyhedrin gene, and placed under control of the polyhedrin promoter. Successful
insertion of hSBP renders the polyhedrin gene inactive and produces recombinant virus lacking
coat protein. The recombinant viruses are then used to infect S. frugiperda cells or Trichoplusia
larvae for expression of hSBP polypeptide (Smith et al (1983) J Virol 46:584; Engelhard EK et al
(1994) Proc Nat Acad Sci 91:3224-7).

Where the host cell is a mammalian cell, a number of viral-based expression systems can
be used. For example, the expression vector can be derived from an adenovirus nucleotide
sequence. An hSBP polypeptide-encoding sequence can be ligated into an adenovirus
transcription/translation complex, which is composed of the late promoter and tripartite leader
sequence. Insertion of the nucleotide sequence of interest into a nonessential E1 or E3 region of
the viral genome will result in the production of a viable virus capable of expressing hSBP
polypeptide in infected host cells (Logan and Shenk (1984) Proc Natl Acad Sci 81:3655-59). In
addition, transcriptional enhancers, such as the Rous sarcoma virus (RSV) enhancer, can be used
to increase expression in mammalian host cells.

Specific initiation signals may also be required for efficient translation of an hSBP
polypeptide-encoding sequence, e.g., the ATG initiation codon and flanking sequences. Where a
native hSBP polypeptide-encoding sequence, its initiation codon and upstream sequences are
inserted into the appropriate expression vector, no additional translational control signals may be
needed. However, where only coding sequence, or a portion thereof, is inserted in an expression
vector, exogenous translational control signals including the ATG initiation codon must be
provided. Furthermore, the initiation codon must be in the correct reading frame to ensure
translation of the entire insert. Exogenous translational elements and initiation codons can be
derived from various origins, and can be either natural or synthetic. Expression efficiency can be
enhanced by including enhancers appropriate to the cell system in use (Scharf D et al (1994)
Results Probl Cell Differ 20:125-62; Bittner et al (1987) Methods in Enzymol 153:516-544).
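
The reading-frame requirement stated above can be illustrated with a short Python sketch that
reads codons from the first ATG until the first in-frame stop codon. The function name and any
sequences used with it are illustrative assumptions. The sketch considers only the first ATG
and ignores flanking-sequence context, which a real analysis would also take into account.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(insert: str):
    """Return the open reading frame starting at the first ATG, or None.

    Codons are read in frame with the first ATG. The ORF ends at the first
    in-frame stop codon; if no in-frame stop is found, None is returned.
    """
    seq = insert.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return None
```

For example, joining a vector-supplied ATG in frame with a toy coding sequence
(`"ATG" + "GCCAAATAA"`) yields a complete reading frame, whereas a one-base shift
(`"ATGC" + "GCCAAATAA"`) puts the downstream stop codon out of frame, so the intended
reading frame is lost.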

Host cells can be selected for hSBP polypeptide expression according to the ability of the
cell to modulate the expression of the inserted sequences or to process the expressed protein in a
desired fashion. Such modifications of the polypeptide include, but are not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.
Post-translational processing that involves cleavage of a "prepro" form of the protein may also be
important for correct polypeptide folding, membrane insertion, and/or function. Host cells such
as CHO, HeLa, MDCK, 293, WI38, and others have specific cellular machinery and
characteristic mechanisms for such post-translational activities and may be chosen to ensure the
correct modification and processing of the introduced, foreign polypeptide.

Where long-term, high-yield recombinant polypeptide production is desired, stable 
expression is preferred. For example, cell lines that stably express hSBP can be transformed 
using expression vectors containing viral origins of replication or endogenous expression 
elements and a selectable marker gene. After introduction of the vector, cells can be grown for 
1-2 days in enriched media before they are exposed to selective media. The selectable marker,
which confers resistance to the selective media, allows growth and recovery of cells that
successfully express the introduced sequences. Resistant, stably transformed cells can be 
proliferated using tissue culture techniques appropriate to the host cell type. 

Any number of selection systems can be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase (Wigler M et al (1977)
Cell 11:223-32) and adenine phosphoribosyltransferase (Lowy I et al (1980) Cell 22:817-23)
genes which can be employed in tk- or aprt- cells, respectively. Also, antimetabolite, antibiotic
or herbicide resistance can be used as the basis for selection; for example, dhfr, which confers
resistance to methotrexate (Wigler M et al (1980) Proc Natl Acad Sci 77:3567-70); npt, which
confers resistance to the aminoglycosides neomycin and G-418 (Colbere-Garapin F et al (1981) J
Mol Biol 150:1-14); and als or pat, which confer resistance to chlorsulfuron and phosphinotricin
acetyltransferase, respectively (Murry, supra). Additional selectable genes have been described,
for example, trpB, which allows cells to utilize indole in place of tryptophan, or hisD, which
allows cells to utilize histidinol in place of histidine (Hartman SC and RC Mulligan (1988) Proc
Natl Acad Sci 85:8047-51). Recently, the use of visible markers has gained popularity with such
markers as anthocyanins, β-glucuronidase and its substrate, GUS, and luciferase and its substrate,
luciferin, being widely used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes CA et al
(1995) Methods Mol Biol 55:121-131).

Identification of Transformants Containing the Polynucleotide Sequence 

Although the presence/absence of marker gene expression suggests that the gene of 
interest is also present, its presence and expression should be confirmed. For example, if the 
hSBP polypeptide-encoding sequence is inserted within a marker gene sequence, recombinant
cells containing this sequence can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with an hSBP sequence under the control of
a single promoter. Expression of the marker gene in response to induction or selection is
indicative of expression of the tandem hSBP.

Alternatively, host cells that contain the coding sequence for hSBP polypeptides and
express hSBP polypeptides can be identified by a variety of procedures known to those of skill in
the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridization
and protein bioassay or immunoassay techniques including membrane, solution, or chip-based
technologies for the detection and/or quantitation of the nucleic acid or protein.

The presence of the polynucleotide sequence encoding hSBP polypeptides can be detected
by DNA-DNA or DNA-RNA hybridization or amplification using probes, portions or fragments
of polynucleotides encoding hSBP. Nucleic acid amplification-based assays involve the use of
oligonucleotides or oligomers based on the hSBP polypeptide-encoding sequence to detect
transformants containing hSBP polypeptide-encoding DNA or RNA. As used herein,
"oligonucleotides" or "oligomers" refer to a nucleic acid sequence of at least about 10
nucleotides and as many as about 60 nucleotides, preferably about 15 to 30 nucleotides, and more
preferably about 20-25 nucleotides, which can be used as a probe or amplimer.
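
As a minimal illustration of the oligonucleotide definition above, the following Python sketch
enumerates candidate probes of a chosen length from a coding sequence and returns each as its
reverse complement, i.e., the strand that would hybridize to the sense sequence. The function
names, the default length of 20, and the `step` parameter are assumptions made for
illustration; the length bounds of 10 and 60 follow the range stated in the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_oligos(coding_seq: str, length: int = 20, step: int = 1) -> list:
    """List reverse-complement oligos for every window of `length` bases.

    Windows start every `step` bases along the coding sequence. The length must
    fall within the about 10 to about 60 nucleotide range given in the text.
    """
    if not 10 <= length <= 60:
        raise ValueError("oligo length must be between 10 and 60 nucleotides")
    seq = coding_seq.upper()
    return [reverse_complement(seq[i:i + length])
            for i in range(0, len(seq) - length + 1, step)]
```

In practice, candidates would be filtered further, for example by GC content and by
uniqueness against the host genome, before use as probes or amplimers.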

A variety of protocols for detecting and measuring the expression of hSBP, using either
polyclonal or monoclonal antibodies specific for the protein, are known in the art. Examples
include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and fluorescent
activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal
antibodies reactive to two non-interfering epitopes on hSBP is preferred, but a competitive
binding assay can be employed. These and other assays are described in, e.g., Hampton R et al
(1990, Serological Methods, a Laboratory Manual, APS Press, St Paul MN) and Maddox DE et al
(1983, J Exp Med 158:1211).

A wide variety of detectable labels and conjugation techniques are known in the art
and can be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to hSBP-encoding polynucleotides
include oligolabeling, nick translation, end-labeling or PCR amplification using a labeled
nucleotide. Alternatively, a nucleotide sequence encoding an hSBP polypeptide can be cloned
into a vector for the production of an mRNA probe. Such vectors, which are known in the art and
commercially available, can be used to synthesize RNA probes in vitro by addition of an
appropriate RNA polymerase such as T7, T3 or SP6 and labeled nucleotides.

A number of companies, including Pharmacia Biotech (Piscataway NJ), Promega 
(Madison WI), and US Biochemical Corp (Cleveland OH), supply commercial kits and protocols 
suitable for the methods described above. Suitable reporter molecules or labels include those 
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as 
substrates, cofactors, inhibitors, magnetic particles and the like, as described in U.S. Patent Nos. 
3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149 and 4,366,241, each of which
is incorporated herein by reference. Recombinant immunoglobulins can be produced
according to U.S. Patent No. 4,816,567, incorporated herein by reference.

Purification of hSBP

Host cells transformed with a nucleotide sequence encoding an hSBP polypeptide can be 
cultured under conditions suitable for the expression and recovery of the hSBP polypeptide from 
cell culture. The polypeptide produced by a recombinant cell may be secreted or retained 
intracellularly depending on the sequence and/or the vector used. As will be understood by those
of skill in the art, expression vectors containing polynucleotides encoding hSBP polypeptides can
be designed with signal sequences that direct secretion of hSBP through a prokaryotic or 
eukaryotic cell membrane. 

Recombinant hSBP constructs can also include a nucleotide sequence(s) encoding one or
more polypeptide domains that, when expressed in-frame with the hSBP-encoding sequence,
facilitate purification of soluble proteins (Kroll DJ et al (1993) DNA Cell Biol 12:441-53; cf.
discussion of vectors infra containing fusion proteins). Such purification-facilitating domains
include, but are not limited to, metal chelating peptides (e.g., histidine-tryptophan modules) that
allow purification with immobilized metals, protein A domains that allow purification with
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity
purification system (Immunex Corp, Seattle WA). A cleavable linker sequence(s) (e.g., Factor
Xa or enterokinase (Invitrogen, San Diego CA)) between the purification domain and the hSBP
polypeptide-encoding sequence can be included to facilitate purification. One such expression
vector provides for expression of a fusion protein comprising 6 histidine residues followed by
thioredoxin and an enterokinase cleavage site. The histidine residues facilitate purification on
IMIAC (immobilized metal ion affinity chromatography as described in Porath et al (1992)
Protein Expression and Purification 3:263-281), while the enterokinase cleavage site provides a
means for separating the hSBP domain from the remainder of the fusion protein.

hSBP polypeptides (which polypeptides encompass polypeptides composed of a portion
of the native hSBP amino acid sequence) can also be produced by direct peptide synthesis using
solid-phase techniques (cf Stewart et al (1969) Solid-Phase Peptide Synthesis, WH Freeman Co,
San Francisco; Merrifield J (1963) J Am Chem Soc 85:2149-2154). In vitro protein synthesis
can be performed using manual techniques or by automation. Automated synthesis can be
achieved by, for example, using the Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer,
Foster City CA) in accordance with the instructions provided by the manufacturer. Various
fragments of hSBP can be chemically synthesized separately and combined using chemical
methods to produce the full length molecule.
Uses of hSBP

The rationale for use of the nucleotide and polypeptide sequences disclosed herein is
based in part on the differential expression of hSBP-encoding sequences in breast tumor tissue
and in part on the chemical and structural homology between: 1) hSBP1, rat prostatic binding
protein C1 (GI 206442; Delaey et al. supra), rat prostatic binding protein C2 (Delaey et al. 1987
Nucl Acid Res 15:1627-1641) and rabbit uteroglobin (Menne et al. 1982 Proc Natl Acad Sci USA
79:4853-4857) (Figure 5), and 2) hSBP2, human mammaglobin (GI 1199595; Watson et al. supra),
and rat prostatic binding protein C3 (GI 206543; Parker et al. supra) (Figure 6).

Accordingly, hSBP or an hSBP derivative can be used in the diagnosis and management
of breast cancer. Given the homology of hSBP with rat PBP, and the differential expression of
hSBP in human breast tumor tissue, hSBP can be used as a diagnostic marker for human breast
cancer. Expression of rat PBP is regulated by androgens (Muder et al. 1984 Biochem Biophys
Acta 781:121-9; Page et al. 1983 Cell 32:495-502) and by growth hormone (Reiter et al. 1995
Endocrinol 166:3338-44). Thus the level of hSBP can serve as a marker for transformation of
normal breast cells into cancerous cells. Alternatively, or in addition, development of breast
cancer can be detected by examining the ratio of hSBP to the levels of steroid hormones (e.g.,
testosterone or estrogen) or to other hormones (e.g., growth hormone, insulin). Thus expression
of hSBP1 and/or hSBP2 can also be used to discriminate between normal and cancerous breast
tissue, to discriminate between different types of breast cancer, to provide guidance in selection
of anti-cancer therapies, to monitor the progress of patients undergoing chemotherapy and/or
other anti-cancer treatments, to determine the success of surgery to remove cancerous tissue, and
to monitor patients who have had or are susceptible to breast cancer. In addition to diagnosis and
treatment of breast cancer after its development, detection of hSBP expression can be used to
identify patients susceptible to breast cancer. Expression of hSBP in cancerous cells can be
examined in breast tissue in situ or in pathology sections. Alternatively, if hSBP is secreted at
sufficient levels, expression of hSBP can be assessed in blood, serum, or plasma. Assessment of
levels of hSBP expression can be used to differentiate between normal and cancerous breast
tissue, and/or different types of cancerous breast tissue (e.g., invasive vs. non-invasive; ductal vs.
axillary lymph node). In addition, because hSBP is differentially expressed in breast tumor cells,
hSBP polypeptides can serve as a target for anti-cancer therapy that is targeted to
hSBP-expressing breast tumor cells. For example, cells can be transfected with antisense sequences to
hSBP-encoding polynucleotides or provided with antagonists to hSBP to reduce or eliminate
hSBP expression in cancerous breast cells. Alternatively, cancerous breast cells, or breast cells
susceptible to cancer, can be transformed (e.g., via gene therapy techniques) with hSBP-encoding
nucleic acid to provide for expression of excess hSBP and interruption of steroid binding.
hSBP Antibodies 

hSBP-specific antibodies are useful for the diagnosis of conditions and diseases
associated with expression of hSBP. Such antibodies include, but are not limited to, polyclonal,
monoclonal, chimeric, single chain, Fab fragments and fragments produced by a Fab expression
library. Neutralizing antibodies, i.e., those which inhibit a biochemical activity of hSBP, are
especially preferred for diagnostics and therapeutics.

hSBP polypeptides suitable for production of antibodies need not be biologically active;
rather, the polypeptide or oligopeptide need only be antigenic. Polypeptides used to generate
hSBP-specific antibodies generally have an amino acid sequence consisting of at least five amino
acids, preferably at least 10 amino acids. Preferably, antigenic hSBP polypeptides mimic an
epitope of the native hSBP. Antibodies specific for short hSBP polypeptides can be generated by
linking the hSBP polypeptide to a carrier, or fusing the hSBP polypeptide to another protein (e.g.,
keyhole limpet hemocyanin), and using the carrier-linked or hSBP chimeric molecule as an
antigen. In general, anti-hSBP antibodies can be produced according to methods well known in
the art.

Various hosts, generally mammalian hosts, can be used to produce anti-hSBP antibodies
(e.g., goats, rabbits, rats, mice). Anti-hSBP antibodies are produced by immunizing the host
(e.g., by injection) with an hSBP polypeptide that retains immunogenic properties (which
encompasses any portion of native hSBP, fragment or oligopeptide). Depending on the host
species, various adjuvants can be used to increase the host's immunological response. Such
adjuvants include, but are not limited to, Freund's, mineral gels (e.g., aluminum hydroxide), and
surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanin, and dinitrophenol. BCG (bacilli Calmette-Guerin) and
Corynebacterium parvum are potentially useful human adjuvants.

Monoclonal anti-hSBP antibodies can be prepared using any technique that provides for
the production of antibody molecules by immortalized cell lines in culture. These techniques
include, but are not limited to, the hybridoma technique originally described by Koehler and
Milstein (1975 Nature 256:495-497), the human B-cell hybridoma technique (Kosbor et al (1983)
Immunol Today 4:72; Cote et al (1983) Proc Natl Acad Sci 80:2026-2030) and the
EBV-hybridoma technique (Cole et al (1985) Monoclonal Antibodies and Cancer Therapy, Alan
R Liss Inc, New York NY, pp 77-96).

In addition, techniques developed for the production of "chimeric antibodies", the splicing
of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen
specificity and biological activity, can be used (Morrison et al (1984) Proc Natl Acad Sci
81:6851-6855; Neuberger et al (1984) Nature 312:604-608; Takeda et al (1985) Nature
314:452-454). Alternatively, techniques described for the production of single chain antibodies
(U.S. Patent No. 4,946,778) can be adapted to produce hSBP-specific single chain antibodies.

Antibodies can be produced in vivo or by screening recombinant immunoglobulin
libraries or panels of highly specific binding reagents as disclosed in Orlandi et al (1989, Proc
Natl Acad Sci 86:3833-3837), and Winter G and Milstein C (1991; Nature 349:293-299).

Antibody fragments having specific binding sites for an hSBP polypeptide can also be
generated. For example, such fragments include, but are not limited to, F(ab')2 fragments, which
can be produced by pepsin digestion of the antibody molecule, and Fab fragments, which can be
generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab
expression libraries can be constructed to allow rapid and easy identification of monoclonal Fab
fragments with the desired specificity (Huse WD et al (1989) Science 246:1275-1281).

A variety of protocols for competitive binding or immunoradiometric assays using either 
polyclonal or monoclonal antibodies having established antigen specificities are well known in 
the art. Such immunoassays typically involve the formation of complexes between an hSBP 
polypeptide and a specific anti-hSBP antibody, and the detection and quantitation of hSBP- 
antibody complex formation. A two-site, monoclonal-based immunoassay utilizing monoclonal 
antibodies reactive to two noninterfering epitopes on a specific hSBP protein is preferred, but a 
competitive binding assay can also be employed. These assays are described in Maddox DE et al 
(1983 J Exp Med 158:1211). 

Diagnostic Assays Using hSBP Specific Antibodies 

Particular hSBP antibodies are useful for the diagnosis of conditions or diseases 
characterized by expression of hSBP (e.g., breast cancer) or in assays to monitor patients being 
treated with hSBP, agonists, antagonists, or inhibitors. Diagnostic assays for hSBP include 
methods using a detectably-labeled anti-hSBP antibody to detect hSBP in human body fluids or 
extracts of cells or tissues. The polypeptides and antibodies of the present invention can be used 
with or without modification. Frequently, the polypeptides and antibodies are labeled by 
covalent or noncovalent attachment to a reporter molecule. A wide variety of such suitable 
reporter molecules are known in the art. 

A variety of protocols for detecting and quantifying hSBP, using either polyclonal or 
monoclonal antibodies specific for an hSBP polypeptide, are known in the art. Examples include 
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and fluorescence- 
activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal 
antibodies reactive to two non-interfering epitopes on hSBP is preferred, but a competitive 
binding assay can instead be employed. These assays are described, among other places, in 
Maddox DE et al (1983 J Exp Med 158:1211). 

In order to provide a basis for diagnosis, normal or standard values for hSBP expression 
must be established. This is accomplished by combining body fluids or cell extracts taken from 
normal subjects, either animal or human, preferably human, with antibody to hSBP under 
conditions suitable for complex formation according to methods well known in the art. The 
amount of standard complex formation can be quantified by comparing detection levels 
associated with known quantities of hSBP with detection levels associated with both control and 
disease samples from biopsied tissues. Standard values obtained from normal samples are 
compared with values obtained from samples from subjects potentially affected by disease. 
Deviation between standard and subject values establishes the presence of disease state. 
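
The quantitation and comparison steps described above can be sketched in code. The following Python example is illustrative only and is not part of the disclosed assay: it assumes a piecewise-linear standard curve built from known quantities of hSBP, and it assumes that a subject value "deviates" when it lies more than k standard deviations from the mean of the normal samples. The function names, the linear interpolation, and the default k = 2 are all hypothetical choices made for this sketch.

```python
from statistics import mean, stdev


def interpolate_quantity(signal, standards):
    """Estimate analyte quantity from a detection signal.

    standards: list of (known_quantity, measured_signal) pairs obtained
    from samples containing known amounts of the analyte.
    Uses piecewise-linear interpolation (an assumption of this sketch).
    """
    pts = sorted(standards, key=lambda p: p[1])  # order by signal
    if not pts[0][1] <= signal <= pts[-1][1]:
        raise ValueError("signal outside the range of the standard curve")
    for (q0, s0), (q1, s1) in zip(pts, pts[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return q0
            return q0 + (q1 - q0) * (signal - s0) / (s1 - s0)
    return pts[0][0]  # single-point curve: signal equals that point


def deviates_from_normal(subject_value, normal_values, k=2.0):
    """Flag a subject value lying more than k standard deviations from
    the mean of the normal samples (k is a hypothetical threshold)."""
    mu = mean(normal_values)
    sd = stdev(normal_values)
    return abs(subject_value - mu) > k * sd
```

In use, each subject signal would first be converted to a quantity with `interpolate_quantity` and then compared against the quantities obtained for normal samples with `deviates_from_normal`. An actual assay would choose its curve model and decision threshold from validation data.
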
Drug Screening 

hSBP polypeptides, which encompass biologically active or immunogenic fragments or 
oligopeptides thereof, can be used for screening therapeutic compounds in any of a variety of 
drug screening techniques. The polypeptide employed in such a test can be free in solution, 
affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of 
binding complexes, between hSBP and the agent being tested, can be measured. 

Preferably, the drug screening technique used provides for high throughput screening of 
compounds having suitable binding affinity to the hSBP, as described in detail in "Determination 
of Amino Acid Sequence Antigenicity" by Geysen HN, WO Application 84/03564, published on 
September 13, 1984, and incorporated herein by reference. In summary, large numbers of 
different small peptide test compounds are synthesized on a solid substrate, such as plastic pins or 
some other surface. The peptide test compounds are reacted with hSBP polypeptides, unreacted 
materials are washed away, and bound hSBP is detected by methods well known in the art. 
Purified hSBP can also be coated directly onto plates for use in the aforementioned drug 
screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the 
polypeptide and immobilize it on a solid support. 

The invention also contemplates the use of competitive drug screening assays in which 
hSBP-specific neutralizing antibodies compete with a test compound for binding of hSBP 
polypeptide. In this manner, the antibodies can be used to detect the presence of any polypeptide 
that shares one or more antigenic determinants with an hSBP polypeptide. 
Uses of the Polynucleotide Encoding hSBP 

A polynucleotide encoding an hSBP polypeptide (which polypeptides include native 
hSBP and fragments thereof) can be used for diagnostic and/or therapeutic purposes. For 
diagnostic purposes, polynucleotides encoding hSBP of this invention can be used to detect and 
quantitate gene expression in biopsied tissues in which expression of hSBP is implicated, 
particularly in diagnosis of breast cancer. The diagnostic assay is useful to assess hSBP 
expression levels (e.g., to distinguish between the absence and presence of hSBP expression, as 
well as to assess various hSBP expression levels (e.g., excessively high, high, moderate, or low)) 
and to monitor regulation of hSBP levels during therapeutic intervention. Included in the scope 
of the invention are oligonucleotide sequences, antisense RNA and DNA molecules, and peptide 
nucleic acids (PNAs). 

Another aspect of the subject invention is to provide for hybridization or PCR probes 
capable of detecting polynucleotide sequences encoding hSBP, including genomic sequences and 
closely related molecules. The specificity of the probe, whether it is made from a highly specific 
region, e.g., 10 unique nucleotides in the 5' regulatory region, or a less specific region, e.g., 
especially in the 3' region, and the stringency of the hybridization or amplification (maximal, 
high, intermediate or low) will determine whether the probe identifies only naturally occurring 
sequences encoding hSBP, alleles or related sequences. 

The probes of the invention can be used in the detection of related sequences; such probes 
preferably comprise at least 50% of the nucleotides from any of the hSBP polypeptide-encoding 
sequences described herein. The hybridization probes of the subject invention can be derived 
from the nucleotide sequence of SEQ ID NO:2 and SEQ ID NO:4, or from their corresponding 
genomic sequences including promoters, enhancer elements and introns of the naturally occurring 
hSBP-encoding sequences. Hybridization probes can be detectably labeled with a variety of 
reporter molecules, including radionuclides (e.g., 32P or 35S), or enzymatic labels (e.g., alkaline 
phosphatase coupled to the probe via avidin/biotin coupling systems), and the like. 

Specific hybridization probes for hSBP-encoding DNAs can also be produced by cloning 
nucleic acid sequences encoding hSBP or hSBP derivatives into vectors for production of 
mRNA probes. Such vectors, which are known in the art and are commercially available, can be 
used to synthesize RNA probes in vitro using an appropriate RNA polymerase (e.g., T7 or SP6 
RNA polymerase) and appropriate radioactively labeled nucleotides. 
Diagnostic Use 

Polynucleotide sequences encoding hSBP polypeptide can be used in the diagnosis of 
conditions or diseases associated with hSBP expression, especially breast cancer. For example, 
polynucleotide sequences encoding hSBP can be used in hybridization or PCR assays of fluids or 
tissues from biopsies to detect hSBP expression. Suitable qualitative or quantitative methods 
include Southern or northern analysis, dot blot or other membrane-based technologies; PCR 
technologies; dip stick, pin, chip and ELISA technologies. All of these techniques are well 
known in the art and are the basis of many commercially available diagnostic kits. 

The nucleotide sequences encoding hSBP disclosed herein provide the basis for assays 
that detect the onset of, susceptibility to, or the presence of breast cancer. Nucleotide sequences 
encoding hSBP polypeptides can be labeled by methods known in the art and combined with a 
fluid or tissue sample from a patient suspected of having or susceptible to breast cancer under 
conditions suitable for the formation of hybridization complexes. After an incubation period, the 
sample is washed with a compatible fluid which optionally contains a dye (or other label 
requiring a developer) if the nucleotide has been labeled with an enzyme. After the compatible 
fluid is rinsed off, the dye is quantitated and compared with a standard. If the amount of dye in 
the biopsied or extracted sample is significantly elevated over that of a comparable negative 
control sample, the nucleotide sequence has hybridized with nucleotide sequences in the sample. 
The presence of hSBP-encoding nucleotide sequences in the sample, particularly the presence of 
elevated levels of hSBP-encoding sequences, indicates that the patient has or is at risk of 
developing the associated disease. 

Such assays can also be used to evaluate the efficacy of a particular therapeutic treatment 
regime in animal studies or in clinical trials, or in monitoring the treatment of an individual 
patient. In order to provide a basis for the diagnosis of disease, a normal or standard profile for 
hSBP expression must be established. This is accomplished by combining body fluids or cell 
extracts taken from normal subjects, either animal or human, with hSBP, or a portion thereof, 
under conditions suitable for hybridization or amplification. Standard hybridization can be 
quantified by comparing, in the same experiment, the values obtained for normal subjects with 
those obtained with a dilution series of hSBP containing known amounts of substantially 
purified hSBP. Standard values obtained from normal samples are compared with values 
obtained from samples from patients afflicted with hSBP-associated diseases, or suspected of 
having such diseases (e.g., breast cancer). Deviation between standard and subject values is used 
to establish the presence of disease. 

Once disease is established, a therapeutic agent is administered and a treatment profile is 
generated. Such assays can be repeated on a regular basis to evaluate whether the values in the 
profile progress toward or return to a normal or standard pattern of hSBP expression. Successive 
treatment profiles can be used to show the efficacy of treatment over a period of several days or 
several months. 

Oligonucleotides based upon hSBP sequences can be used in PCR-based techniques, as 
described in U.S. Patent Nos. 4,683,195 and 4,965,188. Such oligomers are generally chemically 
synthesized, or produced enzymatically or recombinantly. Oligomers generally comprise two 
nucleotide sequences, one with sense orientation (5'->3') and one with antisense (3'<-5'), 
employed under optimized conditions for identification of a specific gene or condition. The same 
two oligomers, nested sets of oligomers, or even a degenerate pool of oligomers can be employed 
under less stringent conditions for detection and/or quantitation of closely related DNA or RNA 
sequences. 
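
The relationship between the sense and antisense oligomers can be illustrated with a short sketch. The Python example below is not taken from the cited PCR patents. It simply shows, for a hypothetical DNA template string, how a sense oligomer is read directly from the 5' end of the target and how the antisense oligomer is the reverse complement of the 3' end. The function names and the default oligomer length of 20 are assumptions made for illustration; practical primer design also weighs melting temperature, GC content and secondary structure, which this sketch ignores.

```python
# Complement table for DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template, length=20):
    """Return (sense, antisense) oligomers flanking a template.

    sense:     the first `length` bases of the template (5'->3').
    antisense: the reverse complement of the last `length` bases,
               also written 5'->3'.
    """
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping oligomers")
    sense = template[:length]
    antisense = reverse_complement(template[-length:])
    return sense, antisense
```

For a hypothetical template `ATGCCCGGGTTA` and a length of 3, `primer_pair` returns `("ATG", "TAA")`.
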

Additional methods for quantitation of expression of a particular molecule according to 
the invention include radiolabeling (Melby PC et al 1993 J Immunol Methods 159:235-44) or 
biotinylating (Duplaa C et al 1993 Anal Biochem 229-36) nucleotides, coamplification of a 
control nucleic acid, and interpolation of experimental results according to standard curves. 
Quantitation of multiple samples can be made more time efficient by running the assay in an 
ELISA format in which the oligomer of interest is presented in various dilutions and rapid 
quantitation is accomplished by spectrophotometric or colorimetric detection. For example, the 
presence of a relatively high amount of hSBP in extracts of biopsied tissues indicates the 
presence of cancerous breast cells. A definitive diagnosis of this type can allow health 
professionals to begin aggressive treatment and prevent further worsening of the condition. 
Similarly, further assays can be used to monitor the progress of a patient during treatment. 
Furthermore, the nucleotide sequences disclosed herein can be used in molecular biology 
techniques that have not yet been developed, provided the new techniques rely on properties of 
nucleotide sequences that are currently known such as the triplet genetic code, specific base pair 
interactions, and the like. 
Therapeutic Use 

Based upon the homology of hSBP to genes encoding prostatic binding proteins, and upon 
the expression profile of hSBP polypeptides in breast tumor cells, polynucleotide sequences encoding 
hSBP disclosed herein may be useful in the treatment of conditions such as breast cancer or other 
conditions associated with hSBP expression or over-expression. 

Expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or 
from various bacterial plasmids, can be used for delivery of nucleotide sequences to the targeted 
organ, tissue or cell population. Recombinant vectors for expression of antisense hSBP 
polynucleotides can be constructed according to methods well known in the art (see, for example, 
the techniques described in Sambrook et al (supra) and Ausubel et al (supra)). 

Polynucleotides comprising the full length cDNA sequence and/or its regulatory 
elements enable researchers to use sequences encoding hSBP as an investigative tool in sense 
(Youssoufian H and HF Lodish 1993 Mol Cell Biol 13:98-104) or antisense (Eguchi et al (1991) 
Ann Rev Biochem 60:631-652) regulation of gene function. Such technology is now well known 
in the art, and sense or antisense oligomers, or larger fragments, can be designed from various 
locations along the coding or control regions. 

Expression of genes encoding hSBP can be decreased by transfecting a cell or tissue with 
expression vectors that express high levels of a desired hSBP-encoding fragment. Such 
constructs can flood cells with untranslatable sense or antisense sequences. Even in the absence 
of integration into the DNA, such vectors can continue to transcribe RNA molecules until all 
copies are disabled by endogenous nucleases. Transient expression can last for a month or more 
with a non-replicating vector (Mettler I, personal communication) and even longer if appropriate 
replication elements are part of the vector system. 

As mentioned above, modifications of gene expression can be obtained by designing 
antisense molecules, DNA, RNA or PNA, to the control regions of the gene encoding hSBP (i.e., the 
promoters, enhancers, and introns). Oligonucleotides derived from the transcription initiation 
site, e.g., between -10 and +10 regions of the leader sequence, are preferred. The antisense 
molecules can also be designed to block translation of mRNA by preventing the transcript from 
binding to ribosomes. Similarly, inhibition of expression can be achieved using "triple helix" 
base-pairing methodology. Triple helix pairing compromises the ability of the double helix to 
open sufficiently for binding of polymerases, transcription factors, or regulatory molecules. 
Recent therapeutic advances using triplex DNA were reviewed by Gee JE et al (In: Huber BE and 
BI Carr (1994) Molecular and Immunologic Approaches, Futura Publishing Co, Mt Kisco NY). 

Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of 
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the 
ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. The 
invention contemplates engineered hammerhead motif ribozyme molecules that can specifically 
and efficiently catalyze endonucleolytic cleavage of sequences encoding hSBP. 

Specific ribozyme cleavage sites within any potential RNA target are initially identified 
by scanning the target molecule for ribozyme cleavage sites, which sites include the following 
sequences, GUA, GUU and GUC. Once identified, short RNA sequences between 15 and 20 
ribonucleotides corresponding to a region of the target gene containing the cleavage site can be 
evaluated for secondary structural features that can render the oligonucleotide inoperable. The 
suitability of candidate targets can also be evaluated by testing accessibility to hybridization with 
complementary oligonucleotides using ribonuclease protection assays. 
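
The initial scanning step lends itself to a simple computational sketch. The Python example below is an illustration under stated assumptions, not the procedure used by the inventors: it locates every GUA, GUU and GUC triplet in a target RNA string and extracts a window of 15 to 20 ribonucleotides around each site as a candidate for further evaluation. The function names, the default window of 17, and the choice to roughly center the window on the triplet are hypothetical. The sketch does not assess secondary structure or accessibility, which the text describes as separate evaluations.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def _as_rna(seq):
    """Normalize input to upper-case RNA (T is read as U)."""
    return seq.upper().replace("T", "U")


def find_cleavage_sites(seq):
    """Return the 0-based start index of each GUA, GUU or GUC triplet."""
    rna = _as_rna(seq)
    return [i for i in range(len(rna) - 2) if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


def candidate_windows(seq, window=17):
    """Return (site_index, subsequence) pairs for each cleavage site.

    Each subsequence is `window` ribonucleotides long (15 to 20) and is
    roughly centered on the triplet, shifted inward near the ends of the
    target. Targets shorter than `window` yield no candidates.
    """
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 ribonucleotides")
    rna = _as_rna(seq)
    if len(rna) < window:
        return []
    candidates = []
    for i in find_cleavage_sites(rna):
        start = max(0, min(i + 1 - window // 2, len(rna) - window))
        candidates.append((i, rna[start:start + window]))
    return candidates
```

Each returned subsequence could then be passed to a secondary-structure prediction tool or tested experimentally, as described above.
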

Antisense molecules and ribozymes of the invention can be prepared by methods known 
in the art for the synthesis of RNA molecules, including techniques for chemical oligonucleotide 
synthesis, e.g., solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules 
can be generated by in vitro and in vivo transcription of DNA sequences encoding hSBP. Such 
DNA sequences can be incorporated into a wide variety of vectors with suitable RNA polymerase 
promoters (e.g., T7 or SP6). Alternatively, antisense cDNA constructs useful in the constitutive 
or inducible synthesis of antisense RNA can be introduced into cell lines, cells, or tissues. 

RNA molecules can be modified to increase intracellular stability and half-life. Possible 
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' 
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than 
phosphodiester linkages within the backbone of the molecule. This concept is inherent in the 
production of PNAs and can be extended in all of these molecules by the inclusion of 
nontraditional bases such as inosine, queuosine and wybutosine as well as acetyl-, methyl-, thio- 
and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine that are not as 
easily recognized by endogenous endonucleases. 

Methods for introducing vectors into cells or tissues include those methods discussed 
infra, which are equally suitable for in vivo, in vitro and ex vivo therapy. In ex vivo therapy, 
vectors are introduced into stem cells obtained from the patient and clonally propagated for 
autologous transplant back into that same patient (see, e.g., U.S. Patent Nos. 5,399,493 and 
5,437,994, incorporated herein by reference). Transfection and liposome-based methods for delivery 
of a nucleotide sequence of interest to accomplish gene therapy are well known in the art. 

Furthermore, the nucleotide sequences for hSBP disclosed herein can be used in 
molecular biology techniques that have not yet been developed, provided the new techniques rely 
on properties of nucleotide sequences that are currently known, including but not limited to such 
properties as the triplet genetic code and specific base pair interactions. 
Detection and Mapping of Related Polynucleotide Sequences 

The hSBP nucleic acid sequences can also be used to generate hybridization probes for 
mapping the naturally occurring genomic sequence. The sequence can be mapped to a particular 
chromosome or to a specific region of the chromosome using well known techniques. These 
include in situ hybridization to chromosomal spreads, flow-sorted chromosomal preparations, or 
artificial chromosome constructions such as yeast artificial chromosomes, bacterial artificial 
chromosomes, bacterial P1 constructions or single chromosome cDNA libraries as reviewed in 
Price CM (1993; Blood Rev 7:127-34) and Trask BJ (1991; Trends Genet 7:149-54). 

The technique of fluorescent in situ hybridization of chromosome spreads is described in, 
for example, Verma et al (1988) Human Chromosomes: A Manual of Basic Techniques, 
Pergamon Press, New York NY. Fluorescent in situ hybridization of chromosomal preparations 
and other physical chromosome mapping techniques can be correlated with additional genetic 
map data. Examples of genetic map data can be found in the 1994 Genome Issue of Science 
(265:1981f). Correlation between the location of a gene encoding hSBP on a physical 
chromosomal map and a specific disease (or predisposition to a specific disease) can help delimit 
the region of DNA associated with that genetic disease. The nucleotide sequences of the subject 
invention can be used to detect differences in gene sequences between normal, carrier, or affected 
individuals. 

In situ hybridization of chromosomal preparations and physical mapping techniques such 
as linkage analysis using established chromosomal markers can be used for extending genetic 
maps. For example, a sequence tagged site based map of the human genome was recently 
published by the Whitehead-MIT Center for Genomic Research (Hudson TJ et al (1995) Science 
270:1945-1954). Often the placement of a gene on the chromosome of another mammalian 
species such as a mouse (Whitehead Institute/MIT Center for Genome Research, Genetic Map of 
the Mouse, Database Release 10, April 28, 1995) can reveal associated markers even if the 
number or arm of a particular human chromosome is not known. New sequences can be assigned 
to chromosomal arms, or parts thereof, by physical mapping. Physical mapping provides 
valuable information to investigators searching for disease genes using positional cloning or other 
gene discovery techniques. Once a disease or syndrome, such as ataxia telangiectasia (AT), has 
been crudely localized by genetic linkage to a particular genomic region, for example, AT to 
11q22-23 (Gatti et al (1988) Nature 336:577-580), other sequences mapping to that area may 
represent associated or regulatory genes for further investigation. The nucleotide sequence of the 
subject invention can also be used to detect differences in the chromosomal location due to 
translocation, inversion, etc, among normal, carrier or affected individuals. 
Pharmaceutical Compositions 

The present invention relates to pharmaceutical compositions which can comprise 
nucleotides, proteins, antibodies, agonists, antagonists, or inhibitors, alone or in combination 
with at least one other agent, such as a stabilizing compound, which can be administered in any 
sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered saline, 
dextrose, and water. Any of these molecules can be administered to a patient alone or in 
combination with other agents, drugs or hormones, in pharmaceutical compositions where it is 
mixed with excipient(s), or with pharmaceutically acceptable carriers. In one embodiment of the 
present invention, the pharmaceutically acceptable carrier is pharmaceutically inert. 
Administration of Pharmaceutical Compositions 

Administration of pharmaceutical compositions is accomplished orally or parenterally. 
Methods of parenteral delivery include topical, intra-arterial (e.g., directly to the breast tumor), 
intramuscular, subcutaneous, intramedullary, intrathecal, intraventricular, intravenous, 
intraperitoneal, or intranasal administration. In addition to the active ingredients, these 
pharmaceutical compositions can contain suitable pharmaceutically acceptable carriers 
comprising excipients and auxiliaries that facilitate processing of the active compounds into 
preparations for pharmaceutical use. Further details on techniques for formulation and 
administration can be found in the latest edition of "Remington's Pharmaceutical Sciences" 
(Mack Publishing Co, Easton PA). 

Pharmaceutical compositions for oral administration can be formulated using 
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral 
administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, 
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for ingestion by 
the patient. 

Pharmaceutical preparations for oral use can be obtained through combination of active 
compounds with solid excipient, optionally grinding a resulting mixture, and processing the 
mixture of granules, after adding suitable auxiliaries if desired, to obtain tablets or dragee cores. 
Suitable excipients are carbohydrate or protein fillers such as sugars, including lactose, sucrose, 
mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose such as 
methyl cellulose, hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; and gums 
including arabic and tragacanth; and proteins such as gelatin and collagen. If desired, 
disintegrating or solubilizing agents can be added, such as the cross-linked polyvinyl pyrrolidone, 
agar, alginic acid, or a salt thereof such as sodium alginate. 

Dragee cores are provided with suitable coatings such as concentrated sugar solutions, 
which can also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, 
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. 
Dyestuffs or pigments can be added to the tablets or dragee coatings for product identification or 
to characterize the quantity of active compound, i.e., dosage. 

Pharmaceutical preparations that can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a coating such as glycerol or sorbitol. 
Push-fit capsules can contain active ingredients mixed with a filler or binders such as lactose or 
starches, lubricants such as talc or magnesium stearate, and, optionally, stabilizers. In soft 
capsules, the active compounds can be dissolved or suspended in suitable liquids, such as fatty 
oils, liquid paraffin, or liquid polyethylene glycol with or without stabilizers. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
active compounds. For injection, the pharmaceutical compositions of the invention can be 
formulated in aqueous solutions, preferably in physiologically compatible buffers such as 
Hanks' solution, Ringer's solution, or physiologically buffered saline. Aqueous injection 
suspensions can contain substances that increase the viscosity of the suspension, such as sodium 
carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds 
can be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles 
include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Optionally, the suspension can also contain suitable stabilizers or 
agents that increase the solubility of the compounds to allow for the preparation of highly 
concentrated solutions. 

For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art. 
Manufacture and Storage 

The pharmaceutical compositions of the present invention can be manufactured in any 
suitable manner known in the art, e.g., by means of conventional mixing, dissolving, granulating, 
dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes. 

The pharmaceutical composition can be provided as a salt and can be formed with many 
acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, 
etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the 
corresponding free base forms. In other cases, the preferred preparation can be a lyophilized 
powder in 1 mM-50 mM histidine, 0.1%-2% sucrose, 2%-7% mannitol at a pH range of 4.5 to 5.5 
that is combined with buffer prior to use. 

After pharmaceutical compositions comprising a compound of the invention formulated 
in an acceptable carrier have been prepared, they can be placed in an appropriate container and 
labeled for treatment of an indicated condition. For administration of hSBP, such labeling would 
include amount, frequency and method of administration. 
Therapeutically Effective Dose 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve the 
intended purpose. The determination of an effective dose is well within the capability of those 
skilled in the art. 

For any compound, the therapeutically effective dose can be estimated initially either in 
cell culture assays, e.g., of neoplastic cells; or in animal models, usually mice, rabbits, dogs, or 
pigs. The animal model is also used to achieve a desirable concentration range and route of 
administration. Such information can then be used to determine useful doses and routes for 
administration in humans. 

A therapeutically effective dose refers to that amount of protein or its antibodies, 
antagonists, or inhibitors that ameliorates the symptoms or condition. Therapeutic efficacy and 
toxicity of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., ED50 (the dose therapeutically effective in 50% of the 
population) and LD50 (the dose lethal to 50% of the population). The dose ratio between 
therapeutic and toxic effects is the therapeutic index, and it can be expressed as the ratio LD50/ED50. 
Pharmaceutical compositions that exhibit large therapeutic indices are preferred. The data 
obtained from cell culture assays and animal studies is used in formulating a range of dosage for 
human use. The dosage of such compounds lies preferably within a range of circulating 
concentrations that include the ED50 with little or no toxicity. The actual dosage can vary within 
this range depending upon, for example, the dosage form employed, sensitivity of the patient, and 
the route of administration. 
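
The arithmetic behind the therapeutic index can be shown with a brief sketch. The Python example below is illustrative only and uses hypothetical dose-response values; none of the numbers come from this disclosure. It assumes that the 50% point is estimated by linear interpolation on a log10 dose scale between the two tested doses that bracket a 50% response, which is one simple convention among several (probit or logistic fitting is common in practice). The function names are invented for this sketch.

```python
import math


def dose_at_half_response(doses, fractions):
    """Estimate the dose producing a 50% response.

    doses:     tested doses (positive numbers, any consistent unit).
    fractions: fraction of the population responding at each dose (0 to 1).
    Interpolates linearly in log10(dose) between the bracketing points
    (an assumption of this sketch).
    """
    pairs = sorted(zip(doses, fractions))
    for (d0, f0), (d1, f1) in zip(pairs, pairs[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_dose = math.log10(d0) + t * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("responses do not bracket 50%")


def therapeutic_index(ld50, ed50):
    """Return the ratio LD50/ED50; larger values are preferred."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50
```

With hypothetical efficacy data of 0%, 50% and 100% response at doses of 1, 10 and 100 units, the estimated ED50 is 10 units. With hypothetical lethality data of 25% and 75% at 100 and 1000 units, the estimated LD50 is about 316 units, giving a therapeutic index of roughly 32.
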

The exact dosage is chosen by the individual physician in view of the patient to be 
treated. Dosage and administration are adjusted to provide sufficient levels of the active moiety 
or to maintain the desired effect. Additional factors that may be taken into account include the 
severity of the disease state, e.g., tumor size and location; age, weight and gender of the patient; 
diet, time and frequency of administration; drug combination(s); reaction sensitivities; and 
tolerance/response to therapy. Long-acting pharmaceutical compositions can be administered 
every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate 
of the particular formulation. 

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of 
about 1 g, depending upon the route of administration. Guidance as to particular dosages and 
methods of delivery is provided in the literature and generally available to practitioners in the art. 
Those skilled in the art will employ different formulations for nucleotides than for proteins or 
their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to 
particular cells, conditions, locations, etc. 

It is contemplated, for example, that hSBP or an hSBP derivative can be delivered in a 
suitable formulation to block the progression of breast cancer. Similarly, administration of hSBP 
antagonists may also inhibit the activity or shorten the lifespan of this protein. 

The examples below are provided to illustrate the subject invention and are not included 
for the purpose of limiting the invention. 

INDUSTRIAL APPLICABILITY 
I. Construction of BRSTTUT01 cDNA Libraries

The BRSTTUT01 cDNA library was constructed from breast tumor removed from a 55
year old female (lot #0005; Mayo Clinic, Rochester MN). The frozen tissue was immediately
homogenized and lysed using a Brinkmann Homogenizer Polytron-PT 3000 (Brinkmann
Instruments, Inc., Westbury NY) in guanidinium isothiocyanate solution. Lysates were then
loaded on a 5.7 M CsCl cushion and ultracentrifuged in a SW28 swinging bucket rotor for 18
hours at 25,000 rpm at ambient temperature. The RNA was extracted once with acid phenol at
pH 4.0 and once with phenol chloroform at pH 8.0 and precipitated using 0.3 M sodium acetate
and 2.5 volumes of ethanol, resuspended in DEPC-treated water and DNase treated for 25 min at
37° C. The reaction was stopped with an equal volume of acid phenol, and the RNA was isolated
using the Qiagen Oligotex kit (QIAGEN Inc, Chatsworth CA) and used to construct the cDNA
library. The RNA was handled according to the recommended protocols in the SuperScript
Plasmid System for cDNA Synthesis and Plasmid Cloning (catalog #18248-013; Gibco/BRL).
cDNAs were fractionated on a Sepharose CL4B column (catalog #275105, Pharmacia), and those
cDNAs exceeding 400 bp were ligated into pSport I. The plasmid pSport I was subsequently
transformed into DH5α™ competent cells (Cat. #18258-012, Gibco/BRL).
II. Isolation and Sequencing of cDNA Clones From BRSTTUT01

Plasmid DNA was released from the cells and purified using the Miniprep Kit (Catalogue
#77468; Advanced Genetic Technologies Corporation, Gaithersburg MD). This kit consists of a
96 well block with reagents for 960 purifications. The recommended protocol was employed
except for the following changes: 1) the 96 wells were each filled with only 1 ml of sterile
Terrific Broth (Catalog #22711, LIFE TECHNOLOGIES™, Gaithersburg MD) with
carbenicillin at 25 mg/L and glycerol at 0.4%; 2) the bacteria were cultured for 24 hours after the
wells were inoculated and then lysed with 60 μl of lysis buffer; 3) a centrifugation step
employing the Beckman GS-6R at 2900 rpm for 5 min was performed before the contents of the
block were added to the primary filter plate; and 4) the optional step of adding isopropanol to
TRIS buffer was not routinely performed. After the last step in the protocol, samples were
transferred to a Beckman 96-well block for storage.

The cDNAs were sequenced by the method of Sanger F and AR Coulson (1975; J Mol
Biol 94:441f), using a Hamilton Micro Lab 2200 (Hamilton, Reno NV) in combination with four
Peltier Thermal Cyclers (PTC200 from MJ Research, Watertown MA) and Applied Biosystems 
377 or 373 DNA Sequencing Systems (Perkin Elmer), and reading frame was determined. 
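
The text does not say how the reading frame was determined. One common approach, sketched here in Python purely as an illustration, is to scan the forward strand for ATG start codons and keep the longest stretch that translates without meeting a stop codon. The function names `translate` and `longest_orf` are hypothetical, the example fragment is invented for the demonstration, and the reverse strand is ignored for brevity.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate from the first base until a stop codon or the end of seq."""
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = unreadable codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def longest_orf(seq: str) -> tuple[int, str]:
    """Return (start index, protein) for the longest ATG-initiated frame."""
    seq = seq.upper()
    best = (0, "")
    for start in range(len(seq) - 2):
        if seq[start:start + 3] == "ATG":
            protein = translate(seq[start:])
            if len(protein) > len(best[1]):
                best = (start, protein)
    return best


# Invented fragment: a short leader, six codons, and a TGA stop.
print(longest_orf("GCCACCATGAAGCTGTCGGTGTGTTGACC"))  # (6, 'MKLSVC')
```

A frame that runs off the end of the read without a stop codon is still reported, which suits partial cDNA clones; whether that matches the original workflow is an assumption.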
III. Homology Searching of cDNA Clones and Their Deduced Proteins

Each cDNA was compared to sequences in GenBank using a search algorithm developed 
by Applied Biosystems and incorporated into the INHERIT™ 670 Sequence Analysis System. In
this algorithm, Pattern Specification Language (TRW Inc, Los Angeles CA) was used to
determine regions of homology. The three parameters that determine how the sequence 
comparisons run were window size, window offset, and error tolerance. Using a combination of 
these three parameters, the DNA database was searched for sequences containing regions of 
homology to the query sequence, and the appropriate sequences were scored with an initial value. 
Subsequently, these homologous regions were examined using dot matrix homology plots to 
distinguish regions of homology from chance matches. Smith-Waterman alignments were used
to display the results of the homology search. 
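
For readers unfamiliar with Smith-Waterman alignment, the following Python sketch shows the general dynamic-programming technique on two short strings. It is not the INHERIT implementation; the function name `smith_waterman` and the scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen only to make the example concrete.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> tuple[int, str, str]:
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below zero, which makes it local.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTAAA", "GGGACGTCCC"))  # (8, 'ACGT', 'ACGT')
```

Only the shared ACGT core is reported because the flanking bases would lower the score, which is the property that separates a local alignment from a global one.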

Peptide and protein sequence homologies were ascertained using the INHERIT™ 670
Sequence Analysis System in a way similar to that used in DNA sequence homologies. Pattern 
Specification Language and parameter windows were used to search protein databases for 
sequences containing regions of homology which were scored with an initial value. Dot-matrix 
homology plots were examined to distinguish regions of significant homology from chance 
matches. 

BLAST, which stands for Basic Local Alignment Search Tool (Altschul SF (1993) J Mol
Evol 36:290-300; Altschul SF et al (1990) J Mol Biol 215:403-10), was used to search for local
sequence alignments. BLAST produces alignments of both nucleotide and amino acid sequences
to determine sequence similarity. Because of the local nature of the alignments, BLAST is
especially useful in determining exact matches or in identifying homologs. BLAST is useful for
matches that do not contain gaps. The fundamental unit of BLAST algorithm output is the
High-scoring Segment Pair (HSP).

An HSP consists of two sequence fragments of arbitrary but equal lengths whose
alignment is locally maximal and for which the alignment score meets or exceeds a threshold or
cutoff score set by the user. The BLAST approach identifies HSPs between a query sequence
and a database sequence, evaluates the statistical significance of any matches found, and reports
only those matches which satisfy the user-selected threshold of significance. The parameter E
establishes the statistically significant threshold for reporting database sequence matches. E is
interpreted as the upper bound of the expected frequency of chance occurrence of an HSP (or set
of HSPs) within the context of the entire database search. Any database sequence whose match
satisfies E is reported in the program output.

IV. Northern Analysis

Northern analysis is a laboratory technique used to detect the presence of a gene transcript
and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound (Sambrook et al. supra).

Analogous computer techniques using BLAST (Altschul SF 1993 and 1990, supra) are 
used to search for identical or related molecules in nucleotide databases such as GenBank or the 
LIFESEQ™ database (Incyte, Palo Alto CA). This analysis is much faster than multiple, 
membrane-based hybridizations. In addition, the sensitivity of the computer search can be 
modified to determine whether any particular match is categorized as exact or homologous.
The basis of the search is the product score which is defined as:

(% sequence identity x % maximum BLAST score) / 100

The product score takes into account both the degree of similarity between two sequences and the 
length of the sequence match. For example, with a product score of 40, the match will be exact
within a 1-2% error; and at 70, the match will be exact. Homologous molecules are usually 
identified by selecting those which show product scores between 15 and 40, although lower 
scores can identify related molecules. The abundance data (Abun) represent the number of 
transcripts of the gene of interest in the cDNA library. Percent abundance is calculated by 
dividing the number of transcripts of a gene of interest present in a cDNA library by the total
number of transcripts in the cDNA library. 
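
The arithmetic described in this section can be sketched as follows. This is a minimal Python illustration, not the software used for the electronic northern analysis. The function names are hypothetical, the category boundaries are one reading of the thresholds stated above (70, 40 and 15), and multiplying by 100 to express abundance as a percentage is an assumption.

```python
def product_score(percent_identity: float, percent_max_blast_score: float) -> float:
    """(% sequence identity x % maximum BLAST score) / 100."""
    return percent_identity * percent_max_blast_score / 100.0


def categorize(score: float) -> str:
    """Map a product score onto the ranges described in the text (assumed cut-offs)."""
    if score >= 70:
        return "exact"
    if score >= 40:
        return "exact within 1-2% error"
    if score >= 15:
        return "homologous"
    return "possibly related"


def percent_abundance(gene_transcripts: int, library_transcripts: int) -> float:
    """Transcripts of the gene of interest as a percentage of the whole library."""
    if library_transcripts <= 0:
        raise ValueError("library_transcripts must be positive")
    return 100.0 * gene_transcripts / library_transcripts


score = product_score(90, 80)
print(score, categorize(score))    # 72.0 exact
print(percent_abundance(3, 1500))  # 0.2
```

The counts 3 and 1500 are invented inputs for the demonstration and do not come from the BRSTTUT01 library.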

V. Extension of hSBP-Encoding Polynucleotides to Full Length or to Recover
Regulatory Elements

Full length hSBP-encoding nucleic acid sequences (SEQ ID NO:2, SEQ ID NO:4, or SEQ
ID NO:6) are used to design oligonucleotide primers for extending a partial nucleotide sequence
to full length and/or for obtaining 5' sequences from genomic libraries. One synthesized primer
is used to initiate extension in the antisense direction (XLR), and a second synthesized primer is
used to extend sequence in the sense direction (XLF). Primers allow the extension of the known
hSBP-encoding sequence "outward" generating amplicons containing new, unknown nucleotide
sequence for the region of interest (U.S. Patent Application 08/487,112, filed June 7, 1995,
specifically incorporated by reference). The initial primers are designed from the cDNA using
OLIGO® 4.06 Primer Analysis Software (National Biosciences), or another appropriate program.
The initial primers are preferably designed to be 22-30 nucleotides in length, to have a GC content
of 50% or more, and to anneal to the target sequence at temperatures of about 68°-72° C. Any
stretch of nucleotides that would result in hairpin structures and primer-primer dimerizations is
avoided.
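
Two of the primer criteria above, length and GC content, are simple enough to check mechanically. The Python sketch below is an illustration only and does not reproduce OLIGO 4.06; the function names and the example primers are hypothetical, and the annealing-temperature, hairpin and primer-dimer checks are deliberately left out because the text does not specify how they are computed.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    if not primer:
        raise ValueError("empty primer")
    return sum(base in "GC" for base in primer) / len(primer)


def meets_stated_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """True if the primer is 22-30 nt long and has at least 50% GC content."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical candidates, not primers from the specification.
print(meets_stated_criteria("GCGCATGCATGCGCATGCATGC"))    # True  (22 nt, 14/22 GC)
print(meets_stated_criteria("ATATATATATATATATATATATAT"))  # False (no G or C)
```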

The original, selected cDNA libraries, or a human genomic library, are used to extend the
sequence; the latter is most useful to obtain 5' upstream regions. If more extension is necessary
or desired, additional sets of primers are designed to further extend the known region.

By following the instructions for the XL-PCR kit (Perkin Elmer) and thoroughly mixing 
the enzyme and reaction mix, high fidelity amplification is obtained. Beginning with 40 pmol of
each primer and the recommended concentrations of all other components of the kit, PCR is 
performed using the Peltier Thermal Cycler (PTC200; MJ Research, Watertown MA) and the 
following parameters: 



Step 1    94° C for 1 min (initial denaturation)
Step 2    65° C for 1 min
Step 3    68° C for 6 min
Step 4    94° C for 15 sec
Step 5    65° C for 1 min
Step 6    68° C for 7 min
Step 7    Repeat steps 4-6 for 15 additional cycles
Step 8    94° C for 15 sec
Step 9    65° C for 1 min
Step 10   68° C for 7:15 min
Step 11   Repeat steps 8-10 for 12 cycles
Step 12   72° C for 8 min
Step 13   4° C (and holding)



A 5-10 μl aliquot of the reaction mixture is analyzed by electrophoresis on a low
concentration (about 0.6-0.8%) agarose mini-gel to determine which reactions were successful in
extending the sequence. Bands containing the largest products were selected and cut out of the
gel. Further purification is accomplished using a commercial gel extraction method such as
QIAQuick™ (QIAGEN Inc). After recovery of the DNA, Klenow enzyme was used to trim
single-stranded, nucleotide overhangs creating blunt ends to facilitate religation and cloning.
After ethanol precipitation, the products are redissolved in 13 μl of ligation buffer. 1 μl
T4-DNA ligase (15 units) and 1 μl T4 polynucleotide kinase are added, and the mixture is
incubated at room temperature for 2-3 hours or overnight at 16° C. Competent E. coli cells (in
40 μl of appropriate media) are transformed with 3 μl of ligation mixture and cultured in 80 μl of
SOC medium (Sambrook J et al, supra). After incubation for one hour at 37° C, the whole
transformation mixture is plated on Luria Bertani (LB)-agar (Sambrook J et al, supra) containing
2xCarb. The following day, several colonies are randomly picked from each plate and cultured in
150 μl of liquid LB/2xCarb medium placed in an individual well of an appropriate,
commercially-available, sterile 96-well microtiter plate. The following day, 5 μl of each
overnight culture is transferred into a non-sterile 96-well plate and, after dilution 1:10 with water,
5 μl of each sample is transferred into a PCR array.

For PCR amplification, 18 μl of concentrated PCR reaction mix (3.3x) containing 4 units
of rTth DNA polymerase, a vector primer and one or both of the gene specific primers used for 
the extension reaction were added to each well. Amplification was performed using the 
following conditions: 



Step 1    94° C for 60 sec
Step 2    94° C for 20 sec
Step 3    55° C for 30 sec
Step 4    72° C for 90 sec
Step 5    Repeat steps 2-4 for an additional 29 cycles
Step 6    72° C for 180 sec
Step 7    4° C (and holding)
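
As a worked check of the cycling conditions listed above, the short Python sketch below encodes the screening program as data and adds up the programmed incubation time. The names `Step`, `CYCLE` and `programmed_seconds` are hypothetical, and the total ignores ramp times between temperatures and the final 4° C hold, so a real run on the thermal cycler takes longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # programmed incubation time


INITIAL_DENATURATION = Step(94, 60)                 # Step 1
CYCLE = (Step(94, 20), Step(55, 30), Step(72, 90))  # Steps 2-4
TOTAL_CYCLES = 1 + 29                               # first pass plus 29 repeats
FINAL_EXTENSION = Step(72, 180)                     # Step 6


def programmed_seconds() -> int:
    """Sum of programmed incubation times, excluding ramps and the 4 C hold."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + TOTAL_CYCLES * per_cycle
            + FINAL_EXTENSION.seconds)


print(programmed_seconds())       # 4440
print(programmed_seconds() / 60)  # 74.0
```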



Aliquots of the PCR reactions are run on agarose gels together with molecular weight
markers. The sizes of the PCR products were compared to the original partial cDNAs, and
appropriate clones were selected, ligated into plasmid and sequenced.
VI. Labeling and Use of Hybridization Probes

Hybridization probes derived from SEQ ID NO:2 and SEQ ID NO:4 are used to screen
cDNAs, genomic DNAs or mRNAs. Although the labeling of oligonucleotides, consisting of
about 20 base-pairs, is specifically described, essentially the same procedure is used with larger
cDNA fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO
4.06 (National Biosciences), labeled by combining 50 pmol of each oligomer and 250 mCi of
[γ-32P] adenosine triphosphate (Amersham, Chicago IL) and T4 polynucleotide kinase (DuPont
NEN®, Boston MA). The labeled oligonucleotides are substantially purified with Sephadex G-25
super fine resin column (Pharmacia). A portion containing 10^7 counts per minute of each of the
sense and antisense oligonucleotides is used in a typical membrane based hybridization analysis
of human genomic DNA digested with one of the following endonucleases (Ase I, Bgl II, Eco RI,
Pst I, Xba I, or Pvu II; DuPont NEN®).

The DNA from each digest is fractionated on a 0.7 percent agarose gel and transferred to
nylon membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out
for 16 hours at 40° C. To remove nonspecific signals, blots are sequentially washed at room
temperature under increasingly stringent conditions up to 0.1 x saline sodium citrate and 0.5%
sodium dodecyl sulfate. After XOMAT AR™ film (Kodak, Rochester NY) is exposed to the
blots in a Phosphoimager cassette (Molecular Dynamics, Sunnyvale CA) for several hours,
hybridization patterns are compared visually.
VII. Antisense Molecules

An hSBP polypeptide-encoding sequence (such sequences encompass full length and
partial hSBP sequences) is used to inhibit in vivo or in vitro expression of naturally occurring
hSBP. Although use of antisense oligonucleotides, comprising about 20 base-pairs, is
specifically described, essentially the same procedure is used with larger cDNA fragments. An
oligonucleotide based on the coding sequences of hSBP, as shown in Figures 1A and 1B and 2A
and 2B, is used to inhibit expression of naturally occurring hSBP. The complementary
oligonucleotide is designed from the most unique 5' sequence as shown in Figures 1A and 1B and
2A and 2B and used either to inhibit transcription by preventing promoter binding to the
upstream nontranslated sequence or translation of an hSBP-encoding transcript by preventing the
ribosome from binding. Using an appropriate portion of the leader and 5' sequence of SEQ ID
NO:2 or SEQ ID NO:4, an effective antisense oligonucleotide includes any 15-20 nucleotides
spanning the region which translates into the signal or early coding sequence of the polypeptide
as shown in Figures 1A and 1B, and 2A and 2B.
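
To make the idea of a complementary oligonucleotide concrete, the Python sketch below takes a 20-nucleotide window of sense sequence spanning the start codon and returns its reverse complement. It is an illustration under stated assumptions, not the design procedure of the invention: the function names are hypothetical, the window position is chosen arbitrarily, and the sense fragment is transcribed from the region around the ATG of SEQ ID NO:2 as printed in the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start: int, length: int = 20) -> str:
    """Antisense oligonucleotide for sense[start:start + length] (15-20 nt)."""
    if not 15 <= length <= 20:
        raise ValueError("length must be between 15 and 20 nucleotides")
    if start < 0 or start + length > len(sense):
        raise ValueError("window falls outside the sense sequence")
    return reverse_complement(sense[start:start + length])


# Leader end, start codon and early coding sequence of SEQ ID NO:2 as printed.
SENSE_FRAGMENT = "GCCACCATGAAGCTGTCGGTGTGTCTC"
print(antisense_oligo(SENSE_FRAGMENT, 0, 20))  # ACCGACAGCTTCATGGTGGC
```

The CAT in the middle of the output is the complement of the ATG start codon, which shows that the window spans the translation start.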
VIII. Expression of hSBP

Expression of the hSBP is accomplished by subcloning the cDNAs into appropriate
vectors and transfecting the vectors into host cells. In this case, the cloning vector, pSport,
previously used for the generation of the cDNA library is used to express hSBP polypeptides in
E. coli. The pSport vector contains a promoter for β-galactosidase upstream of the cloning site,
followed by a sequence encoding the amino-terminal Met and the subsequent 7 residues of
β-galactosidase. Sequences encoding a bacteriophage promoter useful for transcription and a
linker containing a number of unique restriction sites are positioned immediately after the eight
β-galactosidase residue-encoding sequences.

IPTG is used to induce production of the fusion protein in an isolated, transfected 
bacterial strain according to standard methods. The fusion protein comprises the first seven 
residues of β-galactosidase, about 5 to 15 residues of linker, and the full length hSBP-encoding
sequence. The signal sequence directs the secretion of hSBP polypeptide into the bacterial 
growth media, which can then be used directly in the following activity assay. 

IX. hSBP Activity 

Given the homology of hSBP with rat prostatic binding protein (rPBP), human
mammaglobin, rabbit uteroglobin, and FHG 22, activity of hSBP can be assessed by the ability of
the polypeptide to bind to steroid. Methods for assessing steroid binding to a polypeptide are
well known in the art (see, e.g., Heyns et al. 1977 Eur J Biochem 78:221-230). Alternatively,
given the homology between hSBP and rPBP, and the similarities between rPBP and estramustine
binding protein (EMBP), hSBP activity can be assessed by the ability of hSBP to bind
estramustine. Methods for assessing estramustine binding are well known in the art (see, e.g.,
Appelgren et al. 1979 Acta Pharmacol Toxicol 43:368-374; Forsgren et al. 1979 Cancer Res
39:5155-5164; Hoisaeter et al. 1981 J Steroid Biochem 14:251-160).

X. Production of hSBP Specific Antibodies 

hSBP polypeptide substantially purified using PAGE electrophoresis (Sambrook, supra)
is used to immunize rabbits and to produce antibodies using standard protocols. The amino acid 
sequence translated from hSBP is analyzed using DNAStar software (DNAStar Inc) to determine 
regions of high immunogenicity, and a corresponding oligopolypeptide is synthesized and used to 
produce antibodies according to methods known to those of skill in the art. Analysis to select 
appropriate epitopes, such as those near the C-terminus or in hydrophilic regions is described by 
Ausubel et al (supra). 

Typically, antibodies are generated using polypeptides about 15 residues in length,
which are synthesized on an Applied Biosystems Peptide Synthesizer Model 431A using
fmoc-chemistry, and coupled to keyhole limpet hemocyanin (KLH, Sigma) by reaction with
M-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS; Ausubel et al, supra). Rabbits are
immunized with the polypeptide-KLH complex in complete Freund's adjuvant. The resulting
antisera are tested for anti-polypeptide activity by, for example, binding the peptide to plastic,
blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated,
goat anti-rabbit IgG.

XI. Purification of Naturally Occurring hSBP Using Specific Antibodies 

Naturally-occurring or recombinant hSBP is substantially purified by immunoaffinity 
chromatography using antibodies specific for hSBP. An immunoaffinity column is constructed 
by covalently coupling anti-hSBP antibody to an activated chromatographic resin such as 
CnBr-activated Sepharose (Pharmacia Biotech). After coupling, the resin is blocked and washed
according to the manufacturer's instructions. 

Media containing hSBP polypeptide is passed over the immunoaffinity column, and the 
column is washed under conditions that allow the preferential absorbance of hSBP (e.g., high 
ionic strength buffers in the presence of detergent). The column is eluted under conditions that 
disrupt antibody-hSBP binding (e.g., a buffer of pH 2-3 or a high concentration of a chaotrope 
such as urea or thiocyanate ion), and hSBP polypeptide is collected. 

XII. Identification of Molecules Which Interact with hSBP 

hSBP polypeptides, especially biologically active hSBP polypeptides, are labeled with 
125I Bolton-Hunter reagent (Bolton and Hunter (1973) Biochem J 133:529). Candidate molecules
previously arrayed in the wells of a 96 well plate are incubated with the labeled hSBP 
polypeptides, washed, and assayed for labeled hSBP complex. Data obtained using different 
concentrations of hSBP are used to calculate values for the number, affinity, and association of 
hSBP with the candidate molecules. 

All publications and patents mentioned in the above specification are herein incorporated 
by reference. Various modifications and variations of the described method and system of the 
invention will be apparent to those skilled in the art without departing from the scope and spirit 
of the invention. Although the invention has been described in connection with specific preferred 
embodiments, it should be understood that the invention as claimed should not be unduly limited 
to such specific embodiments. Indeed, various modifications of the described modes for carrying 
out the invention which are obvious to those skilled in molecular biology or related fields are 
intended to be within the scope of the following claims. 

Before the present nucleotide and polypeptide sequences are described, it is to be 
understood that this invention is not limited to the particular methodology, protocols, cell lines, 
vectors and reagents described as such may, of course, vary. It is also to be understood that the 
terminology used herein is for the purpose of describing particular embodiments only, and is not 
intended to limit the scope of the present invention which will be limited only by the appended
claims.

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an", and "the" include plural referents unless the context clearly dictates otherwise. Thus, for
example, reference to "a host cell" includes a plurality of such host cells and reference to "the 
antibody" includes reference to one or more antibodies and equivalents thereof known to those 
skilled in the art, and so forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same 
meaning as commonly understood to one of ordinary skill in the art to which this invention 
belongs. Although any methods, devices and materials similar or equivalent to those described 
herein can be used in the practice or testing of the invention, the preferred methods, devices and 
materials are now described. 

All publications mentioned herein are incorporated herein by reference for the purpose of 
describing and disclosing the cell lines, vectors, and methodologies which are described in the 
publications which might be used in connection with the presently described invention. The 
publications discussed herein are provided solely for their disclosure prior to the filing date of the 
present application. Nothing herein is to be construed as an admission that the inventors are not 
entitled to antedate such disclosure by virtue of prior invention.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: INCYTE PHARMACEUTICALS, INC.

(ii) TITLE OF INVENTION: BREAST TUMOR SPECIFIC PROTEINS

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
(B) STREET: 3174 Porter Drive
(C) CITY: Palo Alto
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 94304

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) PCT APPLICATION NUMBER: To Be Assigned
(B) FILING DATE: Herewith
(C) CLASSIFICATION:

(vii) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/747,547
(B) FILING DATE: 12-NOV-1996

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Billings, Lucy J.
(B) REGISTRATION NUMBER: 36,749
(C) REFERENCE/DOCKET NUMBER: PF-0C77 PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (650) 855-0555
(B) TELEFAX: (650) 845-4166



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Lys Leu Ser Val Cys Leu Leu Leu Val Thr Leu Ala Leu Cys Cys
1               5                   10                  15

Tyr Gln Ala Asn Ala Glu Phe Cys Pro Ala Leu Val Ser Glu Leu Leu
            20                  25                  30

Asp Phe Phe Phe Ile Ser Glu Pro Leu Phe Lys Leu Ser Leu Ala Lys
        35                  40                  45

Phe Asp Ala Pro Pro Glu Ala Val Ala Ala Lys Leu Gly Val Lys Arg
    50                  55                  60

Cys Thr Asp Gln Met Ser Leu Gln Lys Arg Ser Leu Ile Ala Glu Val
65                  70                  75                  80

Leu Val Lys Ile Leu Lys Lys Cys Ser Val
                85                  90



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 405 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GTCCAAATCA CTCATTGTTT GTGAAAGCTG AGCTCACAGC AAAACAAGCC ACC ATG  56
                                                           Met
                                                           1

AAG CTG TCG GTG TGT CTC CTG CTG GTC ACG CTG GCC CTC TGC TGC TAC  104
Lys Leu Ser Val Cys Leu Leu Leu Val Thr Leu Ala Leu Cys Cys Tyr
            5                   10                  15

CAG GCC AAT GCC GAG TTC TGC CCA GCT CTT GTT TCT GAG v-TG TTA GAC  152
Gln Ala Asn Ala Glu Phe Cys Pro Ala Leu Val Ser Glu Leu  Leu Asp
        20                  25                  30

TTC [...] TTC ACT AGT GAA CCT CTG TTC AAG TTA AGT CTT GCC AAA TTT  200
Phe Phe   Phe Ile Ser Glu Pro Leu Phe Lys Leu Ser Leu Ala Lys Phe
    35                    40                  45

GAT GCC CCT CCG GAA GCT GTT GCA GCC AAG TTA GGA GTG AAG AGA TGC  248
Asp Ala Pro Pro Glu Ala Val Ala Ala Lys Leu Gly Val Lys Arg Cys
50                  55                  60                  65

ACG GAT CAG ATG TCC CTT CAG AAA CGA AGC CTC ATT GCG GAA GTC CTG  296
Thr Asp Gln Met Ser Leu Gln Lys Arg Ser Leu Ile Ala Glu Val Leu
                70                  75                  80

GTG AAA ATA TTG AAG AAA TGT AGT GTG TGA CATGTAAAAA CTTTCATCCT  346
Val Lys Ile Leu Lys Lys Cys Ser Val *
            85                  90

GGTTTCCACT GTCTTTCAAT GACACCCTGA TCTTCACTGC AGAATGTAAA GGTTTCAAC  405

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Lys Leu Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys
1               5                   10                  15

Tyr Ala Gly Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr
            20                  25                  30

Ile Asn Pro Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu
        35                  40                  45

Phe Ile Asp Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu
    50                  55                  60

Cys Phe Leu Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe
65                  70                  75                  80

Met Gln Leu Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe
                85                  90



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 495 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GATCCTTGCC ACCCGCGACT GAACACCGAC AGCAGCAGCC TCACC ATG AAG TTG  54
                                                  Met Lys Leu

CTG ATG GTC CTC ATG CTG GCG GCC CTC TCC CAG CAC TGC TAC GCA GGC  102
Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys Tyr Ala Gly
    5                   10                  15

TCT GGC TGC CCC TTA TTG GAG AAT GTG ATT TCC AAG ACA ATC AAT CCA  150
Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr Ile Asn Pro
20                  25                  30                  35

CAA GTG TCT AAG ACT GAA TAC AAA GAA CTT CTT CAA GAG TTC ATA GAC  198
Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu Phe Ile Asp
                40                  45                  50

GAC AAT GCC ACT ACA AAT GCC ATA GAT GAA TTG AAG GAA TGT TTT CTT  246
Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu Cys Phe Leu
            55                  60                  65

AAC CAA ACG GAT GAA ACT CTG AGC AAT GTT GAG GTG TTT ATG CAA TTA  294
Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe Met Gln Leu
        70                  75                  80

ATA TAT GAC AGC AGT CTT TGT GAT TTA TTT TAA CTT TCT GCA AGA CCT  342
Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe *
    85                  90

CTG GCT CAC AGA ACT GCA GGG TAT GGT GAG AAA CCA ACT ACG GAT TGC  390

TGC TUVA CCA CAC CTT CTC TTT CTT ATG TCT TTT TAC TAC AAA CTA CAA  438

GAC AAT TGT TGA AAC CTG CTA TAC ATG TTT ATT TTA ATA AAT TGA TGG  486

CAA AAA CTG  495



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Met Ser Thr Ile Lys Leu Ser Leu Cys Leu Leu Ile Met Leu Ala Val
1               5                   10                  15

Cys Cys Tyr Glu Ala Asn Ala Ser Gln Ile Cys Glu Leu Val Ala His
            20                  25                  30

Glu Thr Ile Ser Phe Leu Met Lys Ser Glu Glu Glu Leu Lys Lys Glu
        35                  40                  45

Leu Glu Met Tyr Asn Ala Pro Pro Ala Ala Val Glu Ala Lys Leu Glu
    50                  55                  60

Val Lys Arg Cys Val Asp Gln Met Ser Asn Gly Asp Arg Leu Val Val
65                  70                  75                  80

Ala Glu Thr Leu Val Tyr Ile Phe Leu Glu Cys Gly Val Lys Gln Trp
                85                  90                  95

Val Glu Thr Tyr Tyr Pro Glu Ile Asp Phe Tyr Tyr Asp Met Asn
            100                 105                 110



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 412 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

CGCTAAGTAG AAAACTGAA ATG AGC ACC ATT AAG CTG AGC CTG TGT CTT CT3  52
                     Met Ser Thr Ile Lys Leu Ser Leu Cys Leu Leu
                     1               5                   10

ATC ATG CTG GCT GTT [...] TGC TAT GAA GCT [...] TGT  100
Ile Met Leu Ala Val Cys   Cys Tyr Glu Ala Asn Ala Ser Gln Ile Cys
            15                    20                  25

GAA CTT GT? GCC CAT GAA ACC ATA AGC [...] TTA ATG AAA AGT GAG GAA  148
Glu Leu Val Ala His Glu Thr Ile Ser Phe   Leu Met Lys Ser Glu Glu
        30                  35                    40

GAA [...] AAG AAG GAA CTT GAG ATG TAT AAT GCA CCT CCA GCA GCT GTT  196
Glu Leu   Lys Lys Glu Leu Glu Met Tyr Asn Ala Pro Pro Ala Ala Val
    45                    50                  55

GAA GCA AAA CTG GAA GTG AAG AGA TGT GTA GAC CAG ATG AGC AAT GGA  244
Glu Ala Lys Leu Glu Val Lys Arg Cys Val Asp Gln Met Ser Asn Gly
60                  65                  70                  75

GAC AGA TTG GTA GTA GCA GAA ACA CTG GTA TAC ATT TTT [...] GAA TGT  292
Asp Arg Leu Val Val Ala Glu Thr Leu Val Tyr Ile Phe Leu   Glu Cys
                80                  85                    90

GGT GTG AAA CAA TGG GTA GAA ACA TAT TAT CCT GAG ATC GAT TTC TAC  340
Gly Val Lys Gln Trp Val Glu Thr Tyr Tyr Pro Glu Ile Asp Phe Tyr
            95                  100                 105

TAC GAT ATG AAC TGA TTT TTC CTG [...] AAT GTG ATG GTT TCA AGT  388
Tyr Asp Met Asn *
        110

GCA [...] AAT TAT [...]  412



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 440 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GCTCATCCTT TGCTAAGTCT GAAAACAAAC TGAGCACCAT GAAGCTGTCC CTGTGTCTTC  60

TGTTGGTCAT CCTGGCTGTT CATTGCTATG AAGCTAATGC TGCAAACGTC TGTCCAGCAG  120

TTCTTTCTGT AAGCAAATCT TTCCTATTTG ACAAGGTGGA GAAATTTGAG GCCTATCTTC  180

AGACATTTAA CGCACCTCCA GAGGCTGTTA AAGCAAAAGT GGAAGTGAAG AAATGTATAG  240

ACAGCACTCT GAACTATTTA GAGAAAATGG AAATGGGAAA AATACTGGCA GAAGTCGTTG  300

GGTATTGTAA AGGAACAGAA AACTGAAACA TGGCTCTTCC TG3TCTCCAT TGCTTCTCAC  360

AGATAAACTG ACTTTCCTTG CCCAATGTGA AGGTTTCAAC GTCTT3CACT AATAAATTAC  420

TCTCCTTGCA TGTTAAAAAA  440



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 112 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Arg Leu Ser Leu Cys Leu Leu Thr Ile Leu Val Val Cys Cys Tyr 
1               5                   10                  15

Glu Ala Asn Gly Gln Thr Leu Ala Gly Gln Val Cys Gln Ala Leu Gln 
            20                  25                  30

Asp Val Thr Ile Thr Phe Leu Leu Asn Pro Glu Glu Glu Leu Lys Arg 
        35                  40                  45

Glu Leu Glu Glu Phe Asp Ala Pro Pro Glu Ala Val Glu Ala Asn Leu 
    50                  55                  60

Lys Val Lys Arg Cys Ile Asn Lys Ile Met Tyr Gly Asp Arg Leu Ser 
65                  70                  75                  80

Met Gly Thr Ser Leu Val Phe Ile Met Leu Lys Cys Asp Val Lys Val 
                85                  90                  95

Trp Leu Gln Ile Asn Phe Pro Arg Gly Arg Trp Phe Ser Glu Ile Asn 
            100                 105                 110



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 91 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Lys Leu Ala Ile Thr Leu Ala Leu Val Thr Leu Ala Leu Leu Cys 
1               5                   10                  15

Ser Pro Ala Ser Ala Gly Ile Cys Pro Arg Phe Ala His Val Ile Glu 
            20                  25                  30

Asn Leu Leu Leu Gly Thr Pro Ser Ser Tyr Glu Thr Ser Leu Lys Glu 
        35                  40                  45

Phe Glu Pro Asp Asp Thr Met Lys Asp Ala Gly Met Gln Met Lys Lys 
    50                  55                  60

Val Leu Asp Ser Leu Pro Gln Thr Thr Arg Glu Asn Ile Met Lys Leu 
65                  70                  75                  80

Thr Glu Lys Ile Val Lys Ser Pro Leu Cys Met 
                85                  90



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

Met Lys Leu Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys 
1               5                   10                  15

Tyr Ala Gly Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr 
            20                  25                  30

Ile Asn Pro Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu 
        35                  40                  45

Phe Ile Asp Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu 
    50                  55                  60

Cys Phe Leu Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe 
65                  70                  75                  80

Met Gln Leu Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe 
                85                  90



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 503 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GACAGCGGCT TCCTTGATCC TTGCCACCCG CGACTGAACA CCGACAGCAG CAGCCTCACC 60 

ATG AAG TTG CTG ATG GTC CTC ATG CTG GCG GCC CTC TCC CAG CAC TGC 108 
Met Lys Leu Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys 
1               5                   10                  15

TAC GCA GGC TCT GGC TGC CCC TTA TTG GAG AAT GTG ATT TCC AAG ACA 156 
Tyr Ala Gly Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr 
            20                  25                  30

ATC AAT CCA CAA GTG TCT AAG ACT GAA TAC AAA GAA CTT CTT CAA GAG 204 
Ile Asn Pro Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu 
        35                  40                  45

TTC ATA GAC GAC AAT GCC ACT ACA AAT GCC ATA GAT GAA TTG AAG GAA 252 
Phe Ile Asp Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu 
    50                  55                  60

TGT TTT CTT AAC CAA ACG GAT GAA ACT CTG AGC AAT GTT GAG GTG TTT 300 
Cys Phe Leu Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe 
65                  70                  75                  80

ATG CAA TTA ATA TAT GAC AGC AGT CTT TGT GAT TTA TTT TAA CTT TCT 348 
Met Gln Leu Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe 
                85                  90

GCA AGA CCT TTG GCT CAC AGA ACT GCA GGG TAT GGT GAG AAA CCA ACT 396 

ACG GAT TGC TGC AAA CCA CAC CTT CTC TTT CTT ATG TCT TTT TAC TAC 444 

AAA CTA CAA GAC AAT TGT TGA AAC CTG CTA TAC ATG TTT ATT TTA ATA 492 

AAT TGA TGG CA 503 
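
The codon-by-codon translation shown in the listing above can be reproduced with a short script. The following is a minimal Python sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and do not come from the application. The usage line translates the row of SEQ ID NO: 11 that ends at nucleotide 348, which contains the TAA stop codon.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna: str, start: int = 0) -> str:
    """Translate a cDNA string into one-letter amino acid codes.

    Reading begins at the 0-based offset `start` (after whitespace is
    removed) and stops at the first stop codon or at the end of the
    sequence. Any character other than A, C, G or T raises KeyError.
    """
    seq = "".join(cdna.split()).upper()
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon: translation ends here
            break
        protein.append(residue)
    return "".join(protein)


row = "ATG CAA TTA ATA TAT GAC AGC AGT CTT TGT GAT TTA TTT TAA CTT TCT"
print(translate(row))  # MQLIYDSSLCDLF
```

The output corresponds to the residues Met Gln Leu Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe listed under that row.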



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Lys Leu Val Phe Leu Phe Leu Leu Val Thr Ile Pro Ile Cys Cys 
1               5                   10                  15

Tyr Ala Ser Gly Ser Gly Cys Ser Ile Leu Asp Glu Val Ile Arg Gly 
            20                  25                  30

Thr Ile Asn Ser Thr Val Thr Leu His Asp Tyr Met Lys Leu Val Lys 
        35                  40                  45

Pro Tyr Val Gln Asp His Phe Thr Glu Lys Ala Val Lys Gln Phe Lys 
    50                  55                  60

Gln Cys Phe Leu Asp Gln Thr Asp Lys Thr Leu Glu Asn Val Gly Val 
65                  70                  75                  80

Met Met Glu Ala Ile Phe Asn Ser Glu Ser Cys Gln Gln Pro Ser 
                85                  90                  95



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 509 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

AGTTTCCTGA TTTCTGTCTT GGACAACAGA ACAACCCACA GGGACTGCCT CAAC ATG 57 
                                                            Met 
                                                            1

AAG CTG GTG TTT CTA TTC TTG TTG GTC ACC ATC CCT ATT TGC TGC TAT 105 
Lys Leu Val Phe Leu Phe Leu Leu Val Thr Ile Pro Ile Cys Cys Tyr 
            5                   10                  15

GCC AGT GGT TCT GGC TGC AGT ATT CTA GAT GAA GTT ATT AGA GGT ACA 153 
Ala Ser Gly Ser Gly Cys Ser Ile Leu Asp Glu Val Ile Arg Gly Thr 
        20                  25                  30

ATT AAC TCA ACT GTG ACT TTA CAT GAC TAT ATG AAA TTA GTT AAG CCA 201 
Ile Asn Ser Thr Val Thr Leu His Asp Tyr Met Lys Leu Val Lys Pro 
    35                  40                  45

TAT GTA CAA GAT CAT TTT ACT GAA AAG GCT GTG AAG CAA TTC AAG CAG 249 
Tyr Val Gln Asp His Phe Thr Glu Lys Ala Val Lys Gln Phe Lys Gln 
50                  55                  60                  65

TGT TTT CTA GAT CAG ACC GAC AAG ACT CTG GAA AAT GTT GGC GTG ATG 297 
Cys Phe Leu Asp Gln Thr Asp Lys Thr Leu Glu Asn Val Gly Val Met 
                70                  75                  80

ATG GAG GCA ATA TTT AAC AGT GAA AGC TGT CAA CAG CCA TCC TAA ACA 345 
Met Glu Ala Ile Phe Asn Ser Glu Ser Cys Gln Gln Pro Ser * 
            85                  90                  95

TCT ACA AGA TCT TTG GCC ACA GGA CTC CAG GAA ACT GGC AAT GGC CAA 393 

GCA ACT GAT AAC ACA GAT CAT AAC TCT TCT TTC TTG AAC CCC TTT TTC 441 

TAG CTA TAA AGT GCA AGA CGA TTG TTG AAA CCT CAA ATT TAT GTC TTT 489 

CCA TTT TAT TAA ATT ATC TG 509 

CLAIMS 

1. A substantially purified human steroid binding protein C1 (hSBP1) polypeptide 
comprising the amino acid sequence of SEQ ID NO:1 or fragments thereof. 

2. An isolated and purified polynucleotide sequence encoding an hSBP1 polypeptide of 
claim 1. 

3. An isolated and purified polynucleotide sequence of claim 2 consisting of SEQ ID NO:2 
or variants thereof. 

4. A polynucleotide sequence which is complementary to SEQ ID NO:2 or degenerate 
variants thereof. 

5. A recombinant expression vector comprising the polynucleotide sequence of claim 2. 

6. A recombinant host cell containing the polynucleotide sequence of claim 5. 

7. A method for producing a polypeptide comprising a polypeptide of SEQ ID NO:1, the 
method comprising the steps of: 

a) culturing the host cell of claim 6 under conditions suitable for the expression of the 
polypeptide; and 

b) recovering the polypeptide from the host cell culture. 

8. A pharmaceutical composition comprising a substantially purified hSBP polypeptide 
having an amino acid sequence of SEQ ID NO:1 in conjunction with a suitable pharmaceutical 
carrier. 

9. A purified antibody that specifically binds the polypeptide of claim 1. 

10. A purified antagonist which specifically regulates or modulates the activity of the 
polypeptide of claim 1. 

11. A pharmaceutical composition comprising a substantially purified antagonist of the 
polypeptide of claim 1 in conjunction with a suitable pharmaceutical carrier. 

12. A substantially purified human steroid binding protein C2 (hSBP2) polypeptide 
comprising the amino acid sequence of SEQ ID NO:3 or fragments thereof. 

13. An isolated and purified polynucleotide sequence encoding an hSBP2 polypeptide of 
claim 12. 

14. An isolated and purified polynucleotide sequence of claim 13 consisting of SEQ ID 
NO:4 or variants thereof. 

15. A polynucleotide sequence which is complementary to SEQ ID NO:4 or degenerate 
variants thereof. 

16. A recombinant expression vector comprising the polynucleotide sequence of claim 13. 

17. A recombinant host cell containing the polynucleotide sequence of claim 13. 

18. A method for producing a polypeptide comprising a polypeptide of SEQ ID NO:3, the 
method comprising the steps of: 

a) culturing the host cell of claim 17 under conditions suitable for the expression of the 
polypeptide; and 

b) recovering the polypeptide from the host cell culture. 

19. A pharmaceutical composition comprising a substantially purified human steroid binding 
protein C2 (hSBP2) polypeptide having an amino acid sequence of SEQ ID NO:3 in conjunction 
with a suitable pharmaceutical carrier. 

20. A purified antibody that specifically binds the polypeptide of claim 12. 

21. A purified antagonist which specifically regulates or modulates the activity of the 
polypeptide of claim 12. 

22. A pharmaceutical composition comprising a substantially purified antagonist of the 
polypeptide of claim 12 in conjunction with a suitable pharmaceutical carrier. 

(57) Abstract 

The present invention provides polynucleotides that identify and encode two human steroid binding proteins (hSBP). The invention 
provides for genetically engineered expression vectors and host cells comprising the nucleic acid sequences encoding hSBP polypeptides. 
The invention also provides for the use of substantially purified hSBP polypeptides, antagonists, and nucleotide sequences (e.g., antisense 
sequences) in pharmaceutical compositions for the treatment of diseases associated with the expression of hSBP, specifically in the treatment 
of breast cancer. The invention also describes diagnostic assays for the detection of breast cancer in a susceptible or affected patient. The 
diagnostic assays utilize compositions comprising the polynucleotides encoding hSBP polypeptides or the complements thereof, which 
hybridize with the genomic sequence or the transcript of polynucleotides encoding hSBP, or anti-hSBP antibodies that specifically bind to 
an hSBP polypeptide. 




WO 98/21331 



PCT/US97/20674 



HUMAN BREAST TUMOR-SPECIFIC PROTEINS 
TECHNICAL FIELD 

The present invention relates to nucleic acid and amino acid sequences of proteins that are 
differentially expressed in human breast tumor cells and to the use of these sequences in the 
diagnosis, study, prevention and treatment of disease. 

BACKGROUND ART 
Development of breast cancer is associated with multiple genetic changes that are in turn 
associated with altered expression of specific genes. Breast cancer tissues express genes that are 
not expressed, or are expressed at lower levels, by normal breast tissue. Thus, it is possible to 
differentiate between normal (non-cancerous) breast tissue and cancerous breast tissue by 
analyzing differential gene expression between tissues. In addition, there may be several possible 
alterations that lead to the various possible types of breast cancer. Thus, different types of breast 
tumors (e.g., invasive vs. non-invasive, ductal vs. axillary lymph node) can be differentiated from 
one another by identifying the differences in the genes expressed by different types of 
breast tumor tissues (Porter-Jordan et al. 1994 Hematol Oncol Clin North Am 8:73-100). Breast 
cancer can thus be generally diagnosed by detection of expression of a gene or genes associated 
with breast tumor tissue. Where enough information is available about the differential gene 
expression between various types of breast tumor tissues, the specific type of breast tumor can 
also be diagnosed. 

Nucleotide and amino acid sequences associated with breast tumors can serve as genetic 
markers of inheritable breast cancer. Genetic changes on chromosome 17 are the most frequently 
identified events associated with breast tumors. At least four markers on chromosome 17 have 
been identified: p53 on 17p13.1, regions of loss of heterozygosity (LOH) on 17p13.3 and 
17q12-qter, the breast/ovarian cancer locus (BRCA-1) on 17q21, and a fourth breast cancer growth 
suppressor gene on chromosome 17 (Casey et al. 1993 Hum Molec Genet 2:1921-1927). 

Such genetic markers can also be useful in identifying patients susceptible to breast 
cancer. For example, the genetic marker BRCA-1 has been linked to a susceptibility to 
developing breast and/or ovarian cancer at a young age in a number of families (Hall et al. 1990 
Science 250:1684-1689; Solomon et al. 1991 Cytogenet Cell Genet 58:686-738). The 
cumulative risks of developing breast cancer associated with the BRCA-1 marker are 50% at 50 
years and 82% at 70 years (Easton et al. 1993 Am J Hum Genet 52:678-701). However, since the 
gene encoding BRCA-1 has not been cloned or sequenced, identification of an individual carrier 
of BRCA-1 is not possible without use of linkage analysis. Linkage analysis is generally not 
feasible in clinical practice since the genetic epidemiology required is tedious, if not impossible, 
in most cases (Kent et al. 1995 Europ J Surg Oncol 21:240-241). 

The discovery of nucleotide sequences and polypeptides encoding proteins associated 
with breast cancer would satisfy a need in the art by providing new means of diagnosing and 
treating breast cancer. 

DISCLOSURE OF THE INVENTION 

The present invention features two human steroid binding proteins (hereinafter referred to 
individually as hSBP1 and hSBP2, and collectively as hSBP), and the full-length nucleotide 
sequences encoding these proteins, which are differentially expressed in human breast tumor 
tissue. The transcripts encoding these proteins are present in breast tumor tissue. The first 
polypeptide, referred to hereinafter as human steroid binding protein C1 (hSBP1), is 
characterized as having amino acid sequence homology to rat prostatic binding proteins C1 and 
C2 (PSC1_RAT and PSC2_RAT, respectively) and nucleotide sequence homology to hamster 
FHG22 (GI 206441). The second polypeptide, referred to hereinafter as human steroid binding 
protein C2 (hSBP2), is characterized as having identity to human mammaglobin and homology to 
rat prostatic binding protein C3 (GI 206448). Accordingly, the invention features two 
substantially purified human steroid binding proteins, as shown in amino acid sequences of SEQ 
ID NO:1 and SEQ ID NO:3. 

One aspect of the invention features isolated and substantially purified polynucleotides 
that encode hSBP. In a particular aspect, the polynucleotide is the nucleotide sequence of SEQ 
ID NO:2 and SEQ ID NO:4. In addition, the invention features polynucleotide sequences that 
hybridize under stringent conditions to SEQ ID NO:2 and SEQ ID NO:4. 

The invention additionally features nucleic acid sequences encoding hSBP polypeptides, 
oligonucleotides, peptide nucleic acids (PNA), fragments, portions or antisense molecules 
thereof, and expression vectors and host cells comprising polynucleotides that encode hSBP. The 
present invention also relates to antibodies which bind specifically to an hSBP polypeptide, 
pharmaceutical compositions comprising substantially purified hSBP, fragments thereof, or 
antagonists of hSBP, in conjunction with a suitable pharmaceutical carrier, and methods for 
producing hSBP. 

BRIEF DESCRIPTION OF DRAWINGS 

Figure 1 shows the amino acid sequence (SEQ ID NO:1) and nucleic acid sequence (SEQ 
ID NO:2) of human steroid binding protein C1, hSBP1. The alignment was produced using 
MacDNAsis software (Hitachi Software Engineering Co Ltd, San Bruno, CA). 

Figures 2A and 2B show the amino acid sequence (SEQ ID NO:3) and nucleic acid 
sequence (SEQ ID NO:4) of human steroid binding protein C2, hSBP2 (MacDNAsis software, 
Hitachi Software Engineering Co Ltd). 

Figure 3 shows the northern analysis for the consensus sequence (SEQ ID NO:2) for 
hSBP1 (Incyte clone 606491). The northern analysis was produced electronically using 
LIFESEQ™ database (Incyte Pharmaceuticals, Palo Alto CA). The abundance data (Abun) 
represent the number of transcripts of the gene of interest in the cDNA library. Percent 
abundance is calculated by dividing the number of transcripts of a gene of interest present in a 
cDNA library by the total number of transcripts in the cDNA library. 
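
The percent abundance calculation described for Figure 3 can be written out as a short function. The following Python sketch is illustrative only: the function name `percent_abundance` and the transcript counts in the usage line are hypothetical and are not taken from the LIFESEQ database, and the result is assumed to be expressed as a percentage (the ratio multiplied by 100).

```python
def percent_abundance(gene_transcripts: int, total_transcripts: int) -> float:
    """Return the share of a library's transcripts that belong to one gene.

    gene_transcripts:  number of transcripts of the gene of interest (Abun)
    total_transcripts: total number of transcripts in the cDNA library
    The result is expressed as a percentage.
    """
    if total_transcripts <= 0:
        raise ValueError("the library must contain at least one transcript")
    if not 0 <= gene_transcripts <= total_transcripts:
        raise ValueError("gene count must lie between 0 and the library total")
    return 100.0 * gene_transcripts / total_transcripts


# Hypothetical counts: 12 transcripts of the gene in a library of 4800.
print(percent_abundance(12, 4800))  # 0.25
```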

Figure 4 shows the northern analysis for the consensus sequence (SEQ ID NO:4) 
(LIFESEQ™ database, Incyte Pharmaceuticals, Palo Alto CA). 

Figure 5 shows the amino acid sequence alignments among hSBP1 (606491; SEQ ID 
NO:1), rat prostatic binding proteins C1 and C2 (SEQ ID NOS:5 and 8), and rabbit uteroglobin 
(SEQ ID NO:9), produced using the multisequence alignment program of DNAStar software 
(DNAStar Inc, Madison WI). 

Figure 6 shows the amino acid sequence alignments among hSBP2 (SEQ ID NO:3), 
human mammaglobin (GI 1199595; SEQ ID NO:10), and rat prostatic binding protein C3 (GI 
206453; SEQ ID NO:12), produced using the multisequence alignment program of DNAStar 
software (DNAStar Inc, Madison WI). 

Figures 7A and 7B show the nucleotide sequence alignments between hSBP1 (606491; 
SEQ ID NO:2), hamster FHG22 (GI 1045204; SEQ ID NO:7), and rat prostatic binding protein 
C1 (GI 206441; SEQ ID NO:6). 

Figures 8A and 8B show the nucleotide sequence alignments between hSBP2 (602615; 
SEQ ID NO:4), human mammaglobin (GI 1199595; SEQ ID NO:11), and rat prostatic binding 
protein C3 (GI 206452; SEQ ID NO:13). 

MODES FOR CARRYING OUT THE INVENTION 

Definitions 

"Nucleic acid sequence" as used herein refers to an oligonucleotide, nucleotide or 
polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic 
origin which can be single- or double-stranded, and represent the sense or antisense strand. 
Similarly, "amino acid sequence" as used herein refers to an oligopeptide, peptide, polypeptide, 
or protein sequence. Where "amino acid sequence" is recited herein to refer to an amino acid 
sequence of a naturally-occurring protein molecule, "amino acid sequence" and like terms (e.g., 
polypeptide, or protein) are not meant to limit the amino acid sequence to the complete, native 
amino acid sequence associated with the recited protein molecule. 

"Peptide nucleic acid" as used herein refers to a molecule which comprises an oligomer to 
which an amino acid residue, such as lysine, and an amino group have been added. These small 
molecules, also designated anti-gene agents, stop transcript elongation by binding to their 
complementary (template) strand of nucleic acid (Nielsen PE et al (1993) Anticancer Drug Des 
8:53-63). 

As used herein, "SBP" refers to the amino acid sequences of substantially purified steroid 
binding protein obtained from any species, particularly mammalian, including bovine, ovine, 
porcine, murine, equine, and preferably human, from any source whether natural, synthetic, 
semi-synthetic or recombinant. The term "hSBP" as used herein refers to human steroid binding 
protein and is meant to encompass hSBP1 and hSBP2 polypeptides collectively. 

As used herein, "antigenic amino acid sequence" means an amino acid sequence that, 
either alone or in association with a carrier molecule, can elicit an antibody response in a 
mammal. 

A "variant" of hSBP is defined as an amino acid sequence that is altered by one or more 
amino acids. The variant can have "conservative" changes, wherein a substituted amino acid has 
similar structural or chemical properties, e.g., replacement of leucine with isoleucine. More 
rarely, a variant can have "nonconservative" changes, e.g., replacement of a glycine with a 
tryptophan. Similar minor variations can also include amino acid deletions or insertions, or both. 
Guidance in determining which and how many amino acid residues may be substituted, inserted 
or deleted without abolishing biological or immunological activity can be found using computer 
programs well known in the art, for example, DNAStar software. 

A "deletion" is defined as a change in either amino acid or nucleotide sequence in which 
one or more amino acid or nucleotide residues, respectively, are absent. 

An "insertion" or "addition" is that change in an amino acid or nucleotide sequence which 
has resulted in the addition of one or more amino acid or nucleotide residues, respectively, as 
compared to the naturally occurring hSBP. 

A "substitution" results from the replacement of one or more amino acids or nucleotides 
by different amino acids or nucleotides, respectively. 

The term "biologically active" refers to a hSBP having structural, regulatory, or 
biochemical functions of a naturally occurring hSBP. Likewise, "immunologically active" 
defines the capability of the natural, recombinant or synthetic hSBP, or any oligopeptide thereof, 
to induce a specific immune response in appropriate animals or cells and to bind with specific 
antibodies. 

The term "derivative" as used herein refers to the chemical modification of a nucleic acid 
encoding hSBP or the encoded hSBP. Illustrative of such modifications would be replacement of 
hydrogen by an alkyl, acyl, or amino group. A nucleic acid derivative would encode a 
polypeptide which retains essential biological characteristics of natural hSBP. 

As used herein, the term "substantially purified" refers to molecules, either nucleic or 
amino acid sequences, that are removed from their natural environment, isolated or separated, 
and are at least 60% free, preferably 75% free, and most preferably 90% free from other 
components with which they are naturally associated. 

"Stringency" typically occurs in a range from about Tm-5°C (5°C below the Tm of the 
probe) to about 20°C to 25°C below Tm. As will be understood by those of skill in the art, a 
stringency hybridization can be used to identify or detect identical polynucleotide sequences or to 
identify or detect similar or related polynucleotide sequences. 
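
The temperature range in this definition can be made concrete with a small calculation. The following Python sketch assumes the melting temperature (Tm) of the probe is already known and simply applies the offsets stated above; the names `StringencyWindow` and `stringency_window` and the 70°C example are hypothetical.

```python
from typing import NamedTuple


class StringencyWindow(NamedTuple):
    """Hybridization temperatures (in degrees C) derived from a probe Tm."""
    high_c: float     # most stringent end of the range: Tm - 5
    low_max_c: float  # least stringent end, upper bound: Tm - 20
    low_min_c: float  # least stringent end, lower bound: Tm - 25


def stringency_window(tm_c: float) -> StringencyWindow:
    """Apply the offsets from the definition of stringency to a probe Tm."""
    return StringencyWindow(
        high_c=tm_c - 5.0,
        low_max_c=tm_c - 20.0,
        low_min_c=tm_c - 25.0,
    )


# Hypothetical probe with a Tm of 70 degrees C.
window = stringency_window(70.0)
print(window.high_c, window.low_max_c, window.low_min_c)  # 65.0 50.0 45.0
```

Temperatures near the upper value favor detection of identical sequences, while temperatures toward the lower values also allow similar or related sequences to hybridize.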

The term "hybridization" as used herein shall include "any process by which a strand of 
nucleic acid joins with a complementary strand through base pairing" (Coombs J (1994) 
Dictionary of Biotechnology, Stockton Press, New York NY). Amplification as carried out in the 
polymerase chain reaction technologies is described in Dieffenbach CW and GS Dveksler (1995, 
PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview NY). 
Preferred Embodiments 

The present invention relates to hSBP and to the use of hSBP nucleic acid and amino acid 
sequences in the study, diagnosis, prevention and treatment of disease. cDNAs encoding a 
portion of hSBP were predominantly found in cDNA libraries derived from breast tumor tissue 
(Figures 3 and 4). The abundance data (Abun) reflect the relative level of expression of the hSBP 
sequence in the breast, thymus and prostatic cDNA libraries, with the percentage abundance (Pct 
Abun) representing the percent of total expressed mRNAs that are homologous to the hSBP 
sequence. 

The present invention also encompasses hSBP variants. A preferred hSBP variant is one 
having at least 80% amino acid sequence similarity to an amino acid sequence of an hSBP (i.e., 
an hSBP1 amino acid sequence (SEQ ID NO:1) or an hSBP2 amino acid sequence (SEQ ID 
NO:3)). A more preferred hSBP variant is one having at least 90% amino acid sequence 
similarity to SEQ ID NO:1 or SEQ ID NO:3. A most preferred hSBP variant is one having at 
least 95% amino acid sequence similarity to SEQ ID NO:1 or SEQ ID NO:3. 
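
The three thresholds above can be illustrated with a short calculation. The following Python sketch rests on loud assumptions: it scores percent identity (exact residue matches) over two pre-aligned, equal-length sequences as a simple stand-in for "similarity", which the text does not define in computational terms and which alignment software may score differently. The function names and the example sequences are hypothetical.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs are one-letter amino acid strings of equal length that have
    already been aligned; "-" marks a gap and never counts as a match.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def variant_tier(percent: float) -> Optional[str]:
    """Map a percentage onto the preferred / more preferred / most preferred tiers."""
    if percent >= 95.0:
        return "most preferred"
    if percent >= 90.0:
        return "more preferred"
    if percent >= 80.0:
        return "preferred"
    return None  # below the lowest stated threshold


# Hypothetical 20-residue example with two mismatches at the end.
score = percent_identity("MKLLMVLMLAALSQHCYAGS", "MKLLMVLMLAALSQHCYAAA")
print(score, variant_tier(score))  # 90.0 more preferred
```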

Nucleic acids encoding the human hSBP of the present invention were first identified in 
cDNA, Incyte Clones 606491 and 602615 from breast tumor cell cDNA library BRSTTUT01, 
through a computer-generated search for amino acid sequence alignments. A consensus sequence 
for each of hSBP1 (SEQ ID NO:2) and hSBP2 (SEQ ID NO:4) was derived from the 
overlapping and/or extended nucleic acid sequences as shown in the tables below. 

Table 1. Clones from which the consensus sequence (SEQ ID NO:2) of hSBP-C1 was derived. 

Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library
419412H1       BRSTNOT01      606371H1       BRSTTUT01      1212741H1      BRSTTUT01
603148H1       BRSTTUT01      606491H1       BRSTTUT01      1215122H1      BRSTTUT01
603224H1       BRSTTUT01      825519H1       PROSNOT06      1216374H1      BRSTTUT01
604290H1       BRSTTUT01      967077H1       BRSTNOT05      1217152H1      BRSTTUT01
604954H1       BRSTTUT01      1209955H1      BRSTNOT02
605120H1       BRSTTUT01      1212005H1      BRSTTUT01

Table 2. Clones from which the consensus sequence (SEQ ID NO:4) of hSBP-C2 was derived. 

Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library   Sequence I.D.  cDNA Library
410758H1       BRSTNOT01      899784         BRSTTUT03      968163H1       BRSTNOT05
419059H1       BRSTNOT01                                    977969H1       BRSTNOT02
598065H1       BRSTNOT02      899895H1       BRSTTUT03      10000571H1     BRSTNOT03
601000H1       BRSTNOT02      900118H1       BRSTTUT03      1002776H1      BRSTNOT03
602615H1       BRSTTUT01      901009H1       BRSTTUT03      1004904H1      BRSTNOT02
603548H1       BRSTTUT01      902666H1       BRSTTUT03      1210748H1      BRSTNOT02
603234H1       BRSTTUT01      902354H1       BRSTTUT03
603999H1       BRSTTUT01      959213H1       BRSTTUT03      1212473H1      BRSTTUT01
605093H1       BRSTTUT01      959506H1       BRSTTUT03      1213350H1      BRSTTUT01
605204H1       BRSTTUT01      960045H1       BRSTTUT03      1213570H1      BRSTTUT01
605215H1       BRSTTUT01      960118H1       BRSTTUT03      1213702H1      BRSTTUT01
605561H1       BRSTTUT01      960656H1       BRSTTUT03      1214253H1      BRSTTUT01
606191H1       BRSTTUT01      962153H1       BRSTTUT03      1214304H1      BRSTTUT01
606289H1       BRSTTUT01      962283H1       BRSTTUT03      1214401H1      BRSTTUT01
606611H1       BRSTTUT01      962488H1       BRSTTUT03      1215366H1      BRSTTUT01
[illegible]    BRSTTUT01      962656H1       BRSTTUT03      1215626H1      BRSTTUT01
607089H1       BRSTTUT01      962907H1       BRSTTUT03      1216546H1      BRSTTUT01
897552H1       BRSTNOT05      963043H1       BRSTTUT03      1216653H1      BRSTTUT01
898516H1       BRSTTUT03      963046H1       BRSTTUT03      1216659H1      BRSTTUT01
898821H1       BRSTTUT03      964108H1       BRSTTUT03      1216778H1      BRSTTUT01
899628H1       BRSTTUT03      968127H1       BRSTNOT05




The nucleic acid sequence of SEQ ID NO:2 encodes the hSBP1 amino acid sequence, SEQ ID 
NO:1. The nucleic acid sequence of SEQ ID NO:4 encodes the hSBP2 amino acid sequence, 
SEQ ID NO:3. 

The present invention is based, in part, on the chemical and structural homology between: 
1) the amino acid sequence of hSBP1 and rat prostatic binding protein C1 (GI 206442; Delaey et 
al. 1983 Eur J Biochem 133:645-649), rat prostatic binding protein C2 (Delaey et al. 1987 Nucl 
Acid Res 15:1627-1641), and rabbit uteroglobin (Menne et al. 1982 Proc Natl Acad Sci USA 
79:4853-4857; Figure 5), and the amino acid sequences of hSBP2, human mammaglobin (GI 
1199595; SEQ ID NO:10) and rat prostatic binding protein C3 (GI 206453; SEQ ID NO:12; 
Figure 6); and 2) the nucleotide sequence encoding hSBP1, rat prostatic binding protein C1 
(GI 206442; Delaey et al. supra), and hamster FHG22 (GI 1045204; Dominguez 1995 FEBS 
Letters 376:257-263; Figures 7A and 7B); and hSBP2, human mammaglobin (GI 1199595; 
Watson et al. 1996 Cancer Res 56:860-865), and rat prostatic binding protein C3 (GI 206452; 
Parker et al. 1983 J Biol Chem 258:12-15) (Figures 8A and 8B). 

Rat prostatic binding protein (rPBP) is a tetrameric, steroid-binding glycoprotein found in 
rat ventral prostate, and is the principal protein in rat prostatic fluid (Delaey et al. supra; Parker et 
al. supra; Heyns et al. 1977 Eur J Biochem 78:221-230; Heyns et al. 1977 Biochem Biophys Res 
Commun 77:1492-1499; Parker et al. 1978 Eur J Biochem 85:399-406). The rPBP tetramer is 
composed of two subunits: one subunit containing the polypeptides C1 and C3; and the other 
subunit containing the polypeptides C2 and C3 (Heyns et al. 1978 Eur J Biochem 89:181-186). 
rPBP C3 is homologous to human mammaglobin, which in turn is homologous to human Clara 
cell 10-kilodalton protein and rabbit uteroglobin (Watson et al. supra). 

Although rat PBP is primarily expressed in the testes (Lindzey et al. 1994 Vitamins 
Hormones 49:383-32), transgenic animals harboring a construct containing the 5' flanking region 
of the rat PBP-C3 gene linked to the coding region for the simian virus 40 large tumor antigen 
express the transgene in both the prostate and the mammary gland (Allison et al. 1989 Mol Cell 
Biol 9(5):2254-2257). The expression of the C3 transgene varies with the sex of the transgenic 
animal; male transgenic animals express the rat PBP C3 transgene in the prostate and develop 
prostate carcinoma, while the females express the transgene in the mammary gland and develop 
atypical mammary hyperplasia (Maroulakou et al. 1994 Proc Natl Acad Sci USA 91:11236-40). 
Expression of rPBP is regulated by androgenic steroid (e.g., testosterone) partly by stimulating 
rates of transcription and partly by effects on RNA stability (Parker et al. 1977 Cell 12:401-407; 
Heyns et al. 1977 Biochem Biophys Res Commun 77:1492-1499; Parker et al. 1979 Proc Natl 
Acad Sci USA 76:1580-1584; Page et al. 1982 Mol Cell Endocr 27:343-355). 

rPBP is similar to estramucine binding protein (EMBP) (Heyns et al. 1977 Eur J Biochem 
78:221-30). EMBP is a 46-kDa heterodimer consisting of two closely related subunits, each of 
which, upon reductive cleavage of disulfide bridges, is divided into two components. The 
subunits differ with respect to the components C1 and C2, but share C3 (Bjork et al. 1995 The 
Prostate 27:70-83). EMBP binds estramucine (Appelgren et al. 1979 Acta Pharmacol 
Toxicol 43:368-74; Forsgren et al. 1979 Cancer Res 39:5155-64; Hoisaeter et al. 1981 J Steroid 
Biochem 14:251-60), but does not bind free estrogens (Hoisaeter et al. 1981 J Steroid Biochem 
14:251-260; Forsgren et al. 1979 Proc Natl Acad Sci USA 76:3149-3150). Estramucine, a 
nitrogen mustard derivative of 17β-estradiol (Mittelman et al. 1977 Cancer Treat Rep 61:307-10; 
Johnson et al. 1971 Scand J Urol Nephrol 5:103-7), is used to treat patients with prostatic 
carcinoma. Expression of EMBP is androgen-regulated; this androgen-dependency of EMBP 
tends to decline with the transformation of prostatic tissue into biologically more malignant 
disease (Shiina et al. 1996 Brit J Urol 77:96-101). The ratio of EMBP to dihydroxytestosterone 
is an indicator of the malignant potential of prostatic carcinoma (Shiina et al. supra). 

Rabbit uteroglobin, a homodimeric protein coupled by two disulfide linkages, binds
progesterone and structurally related steroids, is also a substrate for transglutaminases, inhibits
phospholipase A2 activity, and may interfere with the immune and inflammatory activity of
several cell types (Miele et al. 1994 J Endocrinol Invest 17:679-692; Miele et al. 1987 Endocrinol
Rev 8:474-490). Expression of uteroglobin is regulated by tissue-specific response to steroid
hormones (Sandmoller et al. 1994 Oncogene 9:2805-2815).

FHG22 protein was isolated from a female minus male subtracted cDNA library obtained
from the sexually dimorphic Syrian hamster Harderian glands (Dominguez supra). The FHG22
nucleotide and amino acid sequences are similar to the subunits from rat prostatic steroid binding
protein C1, uteroglobin (Miele et al. 1994 J Endocrinol Invest 17:679-692), major cat allergen
Fel d I (chain I), and mouse salivary androgen binding proteins (subunit a) (Karn et al. 1993
Biochem Genet 32:271-277; Dominguez supra). Expression of FHG22 is tissue- and
sex-dependent (Dominguez supra).

hSBP1 and rat prostatic binding protein C1 share 55% identity at the
nucleotide sequence level, whereas hSBP1 and hamster FHG22 share 72% nucleotide sequence
identity. hSBP1 is 90 amino acids in length; the amino acid sequence of hSBP1 has 49% identity
with the amino acid sequence of rat prostatic binding protein C1 (SEQ ID NO:5), 44% identity
with the amino acid sequence of rat prostatic binding protein C2 (SEQ ID NO:8), and 28%
identity with the amino acid sequence of rabbit uteroglobin (SEQ ID NO:9) (Figure 5).
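
Percent identity figures of this kind are typically read off a pairwise alignment. The following is a minimal sketch, assuming two sequences that have already been aligned to equal length with "-" marking gaps. The function name, the toy sequences in the usage note, and the choice of denominator (columns in which neither sequence has a gap) are illustrative assumptions; the specification does not state which alignment program or convention produced the values above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For example, `percent_identity("MKLV-A", "MKIVCA")` compares five ungapped columns, four of which match. Other conventions (e.g., dividing by the length of the shorter sequence) would give different values for the same alignment.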

hSBP2 is 93 amino acids in length and shares 99% nucleotide sequence identity with
human mammaglobin; the nucleotide sequence of hSBP2 is about 43% identical to the nucleotide
sequence of rat prostatic binding protein C3 (Figures 8A and 8B). The amino acid sequence of
hSBP2 is 62% identical to the amino acid sequence of rat prostatic protein C3, and 100%
identical to the amino acid sequence of human mammaglobin (Figure 6). Thus, hSBP-C3 is
identical to human mammaglobin.

The hSBP Coding Sequences

The nucleic acid and deduced amino acid sequences of hSBP are shown in Figures 1
(hSBP1) and 2A and 2B (hSBP2). In accordance with the invention, any nucleic acid sequence
that encodes an amino acid sequence of an hSBP polypeptide can be used to generate
recombinant molecules which express an hSBP polypeptide. In specific embodiments described
herein, a nucleotide sequence encoding a portion of hSBP1 was first isolated as Incyte Clone
606491 from a breast tumor cell line cDNA library BRSTTUT01; and a nucleotide sequence
encoding a portion of hSBP2 was first isolated as Incyte Clone 602615 from a breast tumor cell
line cDNA library BRSTTUT01.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of degenerate variants of hSBP-encoding nucleotide sequences, some
bearing minimal homology to the nucleotide sequences of any known and naturally occurring
gene, can be produced. The invention contemplates each and every possible variation of
nucleotide sequence that can be made by selecting combinations based on possible codon
choices. These combinations are made in accordance with the standard triplet genetic code as
applied to the nucleotide sequence of naturally occurring hSBP, and all such variations are to be
considered as being specifically disclosed herein.
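
To give a sense of how large this "multitude" is, the number of distinct nucleotide sequences encoding a given peptide under the standard genetic code is the product of the number of synonymous codons at each position. The sketch below is illustrative only: the function names are hypothetical, and the short peptides in the usage note are toy inputs rather than hSBP sequences.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(residue: str) -> list:
    """All codons encoding a one-letter residue ('*' denotes a stop codon)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def count_coding_variants(peptide: str, include_stop: bool = True) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    residues = peptide.upper() + ("*" if include_stop else "")
    counts = []
    for residue in residues:
        n = len(synonymous_codons(residue))
        if n == 0:
            raise ValueError(f"unknown residue: {residue!r}")
        counts.append(n)
    return prod(counts)
```

For instance, `count_coding_variants("ML")` multiplies one codon for Met, six for Leu, and three stop codons. For a polypeptide of roughly 90 residues the count is astronomically large, which is why such variants are described by rule rather than enumerated.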

Although nucleotide sequences that encode hSBP and its variants are preferably capable
of hybridizing to the nucleotide sequence of the naturally occurring hSBP under appropriately
selected conditions of stringency, it may be advantageous to produce nucleotide sequences
encoding hSBP or its derivatives possessing a substantially different codon usage. Codons can
be selected to increase the rate at which expression of the peptide occurs in a particular
prokaryotic or eukaryotic expression host in accordance with the frequency with which particular
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence
encoding hSBP and its derivatives without altering the encoded amino acid sequences include the
production of RNA transcripts having more desirable properties (e.g., increased half-life) than
transcripts produced from the naturally occurring sequence.
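
Selecting codons according to host usage can be sketched as a back-translation that picks, for each residue, the synonymous codon the host uses most often. This is a simplified illustration, not the procedure of the specification: the function names are hypothetical, and the usage frequencies in the usage note are invented placeholder values, not measured data for any organism.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def back_translate(peptide: str, codon_usage: dict) -> str:
    """Encode a peptide using the highest-frequency synonymous codon per residue.

    codon_usage maps codon -> relative frequency in the chosen host; codons
    missing from the mapping are treated as frequency 0.
    """
    out = []
    for residue in peptide.upper():
        options = sorted(c for c, aa in CODON_TABLE.items() if aa == residue)
        if not options:
            raise ValueError(f"unknown residue: {residue!r}")
        out.append(max(options, key=lambda c: codon_usage.get(c, 0.0)))
    return "".join(out)


def translate(dna: str) -> str:
    """Translate complete codons of a DNA string (no stop handling)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
```

With the placeholder table `{"CTG": 0.5, "CTT": 0.1, "AAA": 0.7, "AAG": 0.3}`, `back_translate("MKL", ...)` yields a sequence that `translate` maps back to the same peptide, confirming that only the codon choice, not the encoded amino acid sequence, has changed.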

It is now possible to produce a nucleotide sequence encoding an hSBP polypeptide and/or
its derivatives entirely by synthetic chemistry, after which the synthetic gene can be inserted into
any of the many available DNA vectors and expression systems using reagents that are well
known in the art at the time of the filing of this application. Moreover, synthetic chemistry can
be used to introduce mutations into a sequence encoding an hSBP polypeptide.

Also included within the scope of the present invention are polynucleotide sequences that
are capable of hybridizing to the nucleotide sequences of Figures 1A-B and/or 2A-B under
various conditions of stringency. Hybridization conditions are based on the melting temperature
(Tm) of the nucleic acid binding complex or probe, as taught in Berger and Kimmel (1987,
Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol 152, Academic Press,
San Diego CA) incorporated herein by reference, and can be used at a defined stringency.
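
As a rough illustration of how a melting temperature can be estimated for a short probe, the sketch below applies a widely used rule of thumb for short oligonucleotides (2 °C per A or T, 4 °C per G or C). This is an assumption introduced for illustration; it is not necessarily the calculation taught in the cited reference, it ignores salt and formamide concentration, and it is not appropriate for long probes. The function name is hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Approximate Tm in degrees C for a short oligonucleotide (rule of thumb)."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

Hybridization and wash temperatures are then chosen relative to the estimated Tm, with temperatures closer to Tm corresponding to higher stringency.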

Altered nucleic acid sequences encoding hSBP that can be used in accordance with the
invention include deletions, insertions or substitutions of different nucleotides resulting in a
polynucleotide that encodes the same or a functionally equivalent hSBP. The protein can also
comprise deletions, insertions or substitutions of amino acid residues that result in a polypeptide
that is functionally equivalent to hSBP. Deliberate amino acid substitutions can be made on the
basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the
amphipathic nature of the residues with the proviso that biological activity of hSBP is retained.
For example, negatively charged amino acids include aspartic acid and glutamic acid; positively
charged amino acids include lysine and arginine; and amino acids with uncharged polar head
groups having similar hydrophilicity values include leucine, isoleucine, valine; glycine, alanine;
asparagine, glutamine; serine, threonine; phenylalanine, and tyrosine.
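
The residue groupings listed above can be encoded directly to flag whether a proposed substitution stays within a group. The sketch below is illustrative: the group labels and function name are hypothetical, and membership in the same group is only a first-pass screen, since the text requires that biological activity actually be retained.

```python
# Groups taken from the paragraph above, in one-letter amino acid codes.
SUBSTITUTION_GROUPS = {
    "negatively charged": {"D", "E"},
    "positively charged": {"K", "R"},
    "Leu/Ile/Val": {"L", "I", "V"},
    "Gly/Ala": {"G", "A"},
    "Asn/Gln": {"N", "Q"},
    "Ser/Thr": {"S", "T"},
    "Phe/Tyr": {"F", "Y"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues are identical or fall within the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in SUBSTITUTION_GROUPS.values())
```

For example, `is_conservative("D", "E")` is true, while `is_conservative("K", "D")` is false because the two residues carry opposite charges.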

Alleles of hSBP are also encompassed by the present invention. As used herein, an
"allele" or "allelic sequence" is an alternative form of hSBP. Alleles result from a mutation (i.e.,
an alteration in the nucleic acid sequence) and generally produce altered mRNAs and/or
polypeptides that may or may not have an altered structure or function relative to naturally
occurring hSBP. Any given gene may have none, one, or many allelic forms. Common
mutational changes that give rise to alleles are generally ascribed to natural deletions, additions
or substitutions of amino acids. Each of these types of changes may occur alone or in
combination with the other changes, and may occur once or multiple times in a given sequence.

Methods for DNA sequencing are well known in the art and employ such enzymes as the
Klenow fragment of DNA polymerase I, Sequenase® (US Biochemical Corp, Cleveland OH),
Taq polymerase (Perkin Elmer, Norwalk CT), thermostable T7 polymerase (Amersham, Chicago
IL), or combinations of recombinant polymerases and proofreading exonucleases such as the
ELONGASE Amplification System marketed by Gibco BRL (Gaithersburg MD). Preferably, the
process is automated with machines such as the Hamilton Micro Lab 2200 (Hamilton, Reno NV),
Peltier Thermal Cycler (PTC200; MJ Research, Watertown MA) and the ABI 377 DNA
sequencers (Perkin Elmer).

Extending the Polynucleotide Sequence 

The polynucleotide sequence encoding hSBP can be extended utilizing partial nucleotide
sequence and various methods known in the art to detect upstream sequences such as promoters
and regulatory elements. Clones that contain extended sequences are designated by a suffix (see
the tables above). Gobinda et al (1993; PCR Methods Applic 2:318-22) disclose
"restriction-site" polymerase chain reaction (PCR) as a direct method which uses universal
primers to retrieve unknown sequence adjacent to a known locus. First, genomic DNA is
amplified in the presence of a primer to a linker sequence and a primer specific to the known
region. The amplified sequences are subjected to a second round of PCR with the same linker
primer and another specific primer internal to the first one. Products of each round of PCR are
transcribed with an appropriate RNA polymerase and sequenced using reverse transcriptase.

Inverse PCR can be used to amplify or extend sequences using divergent primers based
on a known region (Triglia T et al (1988) Nucleic Acids Res 16:8186). The primers can be
designed using OLIGO® 4.06 Primer Analysis Software (1992; National Biosciences Inc,
Plymouth MN), or another appropriate program, to be 22-30 nucleotides in length, to have a GC
content of 50% or more, and to anneal to the target sequence at temperatures about 68°-72° C.
This method uses several restriction enzymes to generate a suitable fragment in the known region
of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR
template.
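
The length and GC-content criteria stated above for primer design can be checked mechanically. The sketch below is illustrative only: it does not reproduce the OLIGO software, the function names are hypothetical, and the annealing-temperature criterion is deliberately left out because the text does not specify how that temperature is to be computed.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def primer_meets_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check the 22-30 nucleotide length and >= 50% GC criteria from the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc
```

A candidate that passes this screen would still need its annealing behavior evaluated against the 68°-72° C target before use.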

Capture PCR (Lagerstrom M et al (1991) PCR Methods Applic 1:111-19) is a method for
PCR amplification of DNA fragments adjacent to a known sequence in human and yeast artificial
chromosome DNA. Capture PCR also requires multiple restriction enzyme digestions and
ligations to place an engineered double-stranded sequence into an unknown portion of the DNA
molecule before PCR.

Another method that can be used to retrieve unknown sequences is that of Parker JD et al
(1991; Nucleic Acids Res 19:3055-60). Additionally, one can use PCR, nested primers, and
PromoterFinder libraries to "walk in" genomic DNA (PromoterFinder™; Clontech, Palo Alto
CA). This process avoids the need to screen libraries and is useful in finding intron/exon
junctions. Preferably, the libraries used to identify full-length cDNAs have been size-selected to
include larger cDNAs. More preferably, the cDNA libraries used to identify full-length cDNAs
are those generated using random primers, in that such libraries will contain more sequences
comprising regions 5' of the sequence(s) of interest. A randomly primed library can be
particularly useful where oligo d(T) libraries do not yield a full-length cDNA. Genomic libraries
are preferred for identification and isolation of 5' nontranslated regulatory regions of a
sequence(s) of interest.

Capillary electrophoresis can be used to analyze the size of, or confirm the nucleotide
sequence of, sequencing or PCR products. Systems for rapid sequencing are available from
Perkin Elmer, Beckman Instruments (Fullerton CA), and other companies. Capillary sequencing
can employ flowable polymers for electrophoretic separation, four different, laser-activatable
fluorescent dyes (one for each nucleotide), and a charge coupled device camera for detection of
the wavelengths emitted by the fluorescent dyes. Output/light intensity is converted to electrical
signal using appropriate software (e.g. Genotyper™ and Sequence Navigator™ from Perkin
Elmer). The entire process from loading of the samples to computer analysis and electronic data
display is computer controlled. Capillary electrophoresis is particularly suited to the sequencing
of small pieces of DNA that might be present in limited amounts in a particular sample.
Capillary electrophoresis provides reproducible sequencing of up to 350 bp of M13 phage DNA
in 30 min (Ruiz-Martinez MC et al (1993) Anal Chem 65:2851-2858).

Expression of the Nucleotide Sequence

In accordance with the present invention, polynucleotide sequences that encode hSBP
polypeptides (which polypeptides include fragments of the naturally occurring polypeptide,
fusion proteins, and functional equivalents thereof) can be used in recombinant DNA molecules
that direct the expression of hSBP in appropriate host cells. Due to the inherent degeneracy of
the genetic code, other DNA sequences that encode substantially the same or a functionally
equivalent amino acid sequence can be used to clone and express hSBP. As will be understood
by those of skill in the art, it may be advantageous to produce hSBP-encoding nucleotide
sequences possessing non-naturally occurring codons. Codons preferred by a particular
prokaryotic or eukaryotic host (Murray E et al (1989) Nuc Acids Res 17:477-508) can be
selected, for example, to increase the rate of hSBP expression or to produce recombinant RNA
transcripts having a desirable characteristic(s) (e.g., longer half-life than transcripts produced
from naturally occurring sequence).

The nucleotide sequences of the present invention can be engineered in order to alter an
hSBP coding sequence for a variety of reasons, including but not limited to, alterations that
facilitate the cloning, processing and/or expression of the gene product. For example, mutations
can be introduced using techniques that are well known in the art, e.g., site-directed mutagenesis
to insert new restriction sites, alter glycosylation patterns, change codon preference, produce
splice variants, etc.

In another embodiment of the invention, a natural, modified, or recombinant
polynucleotide encoding an hSBP polypeptide can be ligated to a heterologous sequence to
encode a fusion protein. For example, where an hSBP polypeptide is to be used in a peptide
library for screening and identification of inhibitors of hSBP activity, it may be desirable to
provide the hSBP polypeptide in the peptide library as a chimeric hSBP protein that can be
recognized by a commercially available antibody. A fusion protein can also be engineered to
contain a cleavage site located between an hSBP polypeptide-encoding sequence and a
heterologous polypeptide sequence, such that the hSBP polypeptide can be cleaved and purified
away from the heterologous moiety.

In an alternative embodiment of the invention, a nucleotide sequence encoding an hSBP
polypeptide can be synthesized, in whole or in part, using chemical methods well known in the
art (see Caruthers et al (1980) Nuc Acids Res Symp Ser 215-23, Horn et al (1980) Nuc Acids Res
Symp Ser 225-32, etc). Alternatively, the polypeptide itself can be produced using chemical
methods to synthesize an hSBP amino acid sequence, in whole or in part. For example, peptide
synthesis can be performed using various solid-phase techniques (Roberge et al (1995) Science
269:202-204) and automated synthesis can be achieved, for example, using the ABI 431A
Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the
manufacturer.

The newly synthesized peptide can be substantially purified by preparative high performance
liquid chromatography (e.g., Creighton (1983) Proteins, Structures and Molecular Principles, WH
Freeman and Co, New York NY). The composition of the synthetic peptides can be confirmed
by amino acid analysis or sequencing (e.g., the Edman degradation procedure; Creighton, supra).
Additionally, the amino acid sequence of hSBP, or any part thereof, can be altered during direct
synthesis and/or combined using chemical methods with sequences from other proteins, or any
part thereof, to produce a variant polypeptide.

Expression Systems

In order to express a biologically active hSBP polypeptide, the nucleotide sequence
encoding an hSBP polypeptide, or its functional equivalent, is inserted into an appropriate
expression vector, i.e., a vector having the necessary elements for the transcription and translation
of the inserted coding sequence.

Methods well known to those skilled in the art can be used to construct expression vectors
comprising an hSBP polypeptide-encoding sequence and appropriate transcriptional or
translational controls. These methods include in vitro recombinant DNA techniques, synthetic
techniques and in vivo recombination or genetic recombination. Such techniques are described in
Sambrook et al (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press,
Plainview NY and Ausubel FM et al (1989) Current Protocols in Molecular Biology, John Wiley
& Sons, New York NY.

A variety of expression vector/host systems can be utilized to express an hSBP
polypeptide-encoding sequence. These include, but are not limited to, microorganisms such as
bacteria transformed with recombinant bacteriophage, plasmid or cosmid DNA expression
vectors; yeast transformed with yeast expression vectors; insect cell systems infected with virus
expression vectors (e.g., baculovirus); plant cell systems transfected with virus expression vectors
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with
bacterial expression vectors (e.g., Ti or pBR322 plasmid); or animal cell systems.

The "control elements" or "regulatory sequences" of these systems, which vary in their
strength and specificities, are those nontranslated regions of the vector, enhancers, promoters, and
3' untranslated regions that interact with host cellular proteins to facilitate transcription and
translation of a nucleotide sequence of interest. Depending on the vector system and host
utilized, any number of suitable transcriptional and translational elements, including constitutive
and inducible promoters, can be used. For example, when cloning in bacterial systems, inducible
promoters such as the hybrid lacZ promoter of the Bluescript® phagemid (Stratagene, La Jolla
CA) or pSport1 (Gibco BRL), ptrp-lac hybrids, and the like can be used. The baculovirus
polyhedrin promoter can be used in insect cells. Promoters or enhancers derived from the
genomes of plant cells (e.g., heat shock, RUBISCO, and storage protein genes) or from plant
viruses (e.g., viral promoters or leader sequences) can be cloned into the vector. In mammalian
cell systems, promoters from mammalian genes or from mammalian viruses are most
appropriate. Where it is desirable to generate a cell line containing multiple copies of an hSBP
polypeptide-encoding sequence, vectors derived from SV40 or EBV can be used in conjunction
with other optional vector elements, e.g., an appropriate selectable marker.

In bacterial systems, a number of expression vectors can be used to express an hSBP
polypeptide of interest; the choice will vary with a variety of factors, including the use intended
for the hSBP polypeptide produced. For example, when large quantities of an hSBP polypeptide
are required (e.g., for antibody production), vectors that direct high-level expression of fusion
proteins that can be readily purified may be desirable. Such vectors include, but are not limited
to, the multifunctional E. coli cloning and expression vectors such as Bluescript® (Stratagene;
which provides for in-frame ligation of an hSBP polypeptide-encoding sequence with sequences
encoding the amino-terminal Met and the subsequent 7 residues of β-galactosidase, thereby
producing an hSBP polypeptide-β-galactosidase hybrid protein); pIN vectors (Van Heeke &
Schuster (1989) J Biol Chem 264:5503-5509); and the like. pGEX vectors (Promega, Madison
WI) can also be used to express foreign polypeptides as glutathione S-transferase (GST) fusion
proteins. In general, such GST fusion proteins are soluble and can be easily purified from cell
lysates by adsorption to glutathione-agarose beads followed by elution in the presence of free
glutathione. GST fusion proteins can be designed to include heparin, thrombin or factor Xa
protease cleavage sites so that the cloned polypeptide of interest can be readily separated from the
GST moiety.

Where the host cell is yeast (e.g., Saccharomyces cerevisiae), a number of vectors
containing constitutive or inducible promoters such as alpha factor, alcohol oxidase and PGH can
be used. For reviews, see Ausubel et al (supra) and Grant et al (1987) Methods in Enzymology
153:516-544.

Where plant expression vectors are used, the expression of an hSBP polypeptide-encoding
sequence can be driven by any of a number of promoters. For example, viral promoters such as
the 35S and 19S promoters of CaMV (Brisson et al (1984) Nature 310:511-514) can be used
alone or in combination with the omega leader sequence from TMV (Takamatsu et al (1987)
EMBO J 6:307-311). Alternatively, plant promoters, such as the small subunit of RUBISCO
(Coruzzi et al (1984) EMBO J 3:1671-1680; Broglie et al (1984) Science 224:838-843) or heat
shock promoters (Winter J and Sinibaldi RM (1991) Results Probl Cell Differ 17:85-105), can be
used. These constructs can be introduced into plant cells by direct DNA transformation or
pathogen-mediated transfection. For reviews of such techniques, see Hobbs S or Murry LE in
McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill New York NY, pp
191-196 or Weissbach and Weissbach (1988) Methods for Plant Molecular Biology, Academic
Press, New York NY, pp 421-463.

Alternatively, insect cell expression systems can be used to express an hSBP polypeptide.
In one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a
vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. The
hSBP polypeptide-encoding sequence can be cloned into a nonessential region of the virus, such
as the polyhedrin gene, and placed under control of the polyhedrin promoter. Successful
insertion of hSBP renders the polyhedrin gene inactive and produces recombinant virus lacking
coat protein. The recombinant viruses are then used to infect S. frugiperda cells or Trichoplusia
larvae for expression of hSBP polypeptide (Smith et al (1983) J Virol 46:584; Engelhard EK et al
(1994) Proc Natl Acad Sci 91:3224-7).

Where the host cell is a mammalian cell, a number of viral-based expression systems can
be used. For example, the expression vector can be derived from an adenovirus nucleotide
sequence. An hSBP polypeptide-encoding sequence can be ligated into an adenovirus
transcription/translation complex, which is composed of the late promoter and tripartite leader
sequence. Insertion of the nucleotide sequence of interest into a nonessential E1 or E3 region of
the viral genome will result in the production of a viable virus capable of expressing hSBP
polypeptide in infected host cells (Logan and Shenk (1984) Proc Natl Acad Sci 81:3655-59). In
addition, transcriptional enhancers, such as the Rous sarcoma virus (RSV) enhancer, can be used
to increase expression in mammalian host cells.

Specific initiation signals may also be required for efficient translation of an hSBP
polypeptide-encoding sequence, e.g., the ATG initiation codon and flanking sequences. Where a
native hSBP polypeptide-encoding sequence, its initiation codon and upstream sequences are
inserted into the appropriate expression vector, no additional translational control signals may be
needed. However, where only coding sequence, or a portion thereof, is inserted in an expression
vector, exogenous translational control signals including the ATG initiation codon must be
provided. Furthermore, the initiation codon must be in the correct reading frame to ensure
translation of the entire insert. Exogenous translational elements and initiation codons can be
derived from various origins, and can be either natural or synthetic. Expression efficiency can be
enhanced by including enhancers appropriate to the cell system in use (Scharf D et al (1994)
Results Probl Cell Differ 20:125-62; Bittner et al (1987) Methods in Enzymol 153:516-544).
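
The requirement that the initiation codon be in the correct reading frame can be illustrated with a small check that reads an insert from its first ATG and reports how many codons are read before a stop codon, if any. This is a hedged sketch: the function name and the toy sequences in the usage note are hypothetical, and treating the first ATG in the string as the intended initiation codon is a simplifying assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_reading_frame(insert: str) -> dict:
    """Read codons from the first ATG; report codon count and whether a stop was reached."""
    seq = insert.upper()
    start = seq.find("ATG")
    if start == -1:
        return {"start": None, "codons": 0, "terminated": False}
    codons = 0
    terminated = False
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            terminated = True
            break
        codons += 1
    return {"start": start, "codons": codons, "terminated": terminated}
```

For the toy insert `GGATGAAACTGTAAGG`, reading from the ATG gives three codons followed by an in-frame stop. Inserting a single extra base after the second codon (`ATGAAAACTGTAA`) shifts the frame so that the original stop codon is no longer read in frame, which is the kind of error the paragraph above warns against.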

Host cells can be selected for hSBP polypeptide expression according to the ability of the
cell to modulate the expression of the inserted sequences or to process the expressed protein in a
desired fashion. Such modifications of the polypeptide include, but are not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.
Post-translational processing that involves cleavage of a "prepro" form of the protein may also be
important for correct polypeptide folding, membrane insertion, and/or function. Host cells such
as CHO, HeLa, MDCK, 293, WI38, and others have specific cellular machinery and
characteristic mechanisms for such post-translational activities and may be chosen to ensure the
correct modification and processing of the introduced, foreign polypeptide.

Where long-term, high-yield recombinant polypeptide production is desired, stable
expression is preferred. For example, cell lines that stably express hSBP can be transformed
using expression vectors containing viral origins of replication or endogenous expression
elements and a selectable marker gene. After introduction of the vector, cells can be grown for
1-2 days in an enriched media before they are exposed to selective media. The selectable marker,
which confers resistance to the selective media, allows growth and recovery of cells that
successfully express the introduced sequences. Resistant, stably transformed cells can be
proliferated using tissue culture techniques appropriate to the host cell type.

Any number of selection systems can be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase (Wigler M et al (1977)
Cell 11:223-32) and adenine phosphoribosyltransferase (Lowy I et al (1980) Cell 22:817-23)
genes which can be employed in tk- or aprt- cells, respectively. Also, antimetabolite, antibiotic
or herbicide resistance can be used as the basis for selection; for example, dhfr, which confers
resistance to methotrexate (Wigler M et al (1980) Proc Natl Acad Sci 77:3567-70); npt, which
confers resistance to the aminoglycosides neomycin and G-418 (Colbere-Garapin F et al (1981) J
Mol Biol 150:1-14); and als or pat, which confer resistance to chlorsulfuron and phosphinotricin
acetyltransferase, respectively (Murry, supra). Additional selectable genes have been described,
for example, trpB, which allows cells to utilize indole in place of tryptophan, or hisD, which
allows cells to utilize histidinol in place of histidine (Hartman SC and RC Mulligan (1988) Proc
Natl Acad Sci 85:8047-51). Recently, the use of visible markers has gained popularity with such
markers as anthocyanins, β-glucuronidase and its substrate, GUS, and luciferase and its substrate,
luciferin, being widely used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes CA et al
(1995) Methods Mol Biol 55:121-131).

Identification of Transformants Containing the Polynucleotide Sequence 

Although the presence/absence of marker gene expression suggests that the gene of
interest is also present, its presence and expression should be confirmed. For example, if the
hSBP polypeptide-encoding sequence is inserted within a marker gene sequence, recombinant
cells containing this sequence can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with an hSBP sequence under the control of
a single promoter. Expression of the marker gene in response to induction or selection is
indicative of expression of the tandem hSBP.

Alternatively, host cells that contain the coding sequence for hSBP polypeptides and 
express hSBP polypeptides can be identified by a variety of procedures known to those of skill in
the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridization
and protein bioassay or immunoassay techniques including membrane, solution, or chip-based 
technologies for the detection and/or quantitation of the nucleic acid or protein. 

The presence of the polynucleotide sequence encoding hSBP polypeptides can be detected 
by DNA-DNA or DNA-RNA hybridization or amplification using probes, portions or fragments 
of polynucleotides encoding hSBP. Nucleic acid amplification-based assays involve the use of 
oligonucleotides or oligomers based on the hSBP polypeptide-encoding sequence to detect 
transformants containing hSBP polypeptide-encoding DNA or RNA. As used herein,
"oligonucleotides" or "oligomers" refer to a nucleic acid sequence of at least about 10
nucleotides and as many as about 60 nucleotides, preferably about 15 to 30 nucleotides, and more 
preferably about 20-25 nucleotides which can be used as a probe or amplimer. 

A variety of protocols for detecting and measuring the expression of hSBP, using either 
polyclonal or monoclonal antibodies specific for the protein are known in the art. Examples 
include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and
fluorescence-activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on hSBP is preferred, but a
competitive binding assay can be employed. These and other assays are described in, e.g.,
Hampton R et al (1990, Serological Methods, a Laboratory Manual, APS Press, St Paul MN) and
Maddox DE et al (1983, J Exp Med 158:1211).

A wide variety of detectable labels and conjugation techniques are known in the art
and can be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to hSBP-encoding polynucleotides
include oligolabeling, nick translation, end-labeling or PCR amplification using a labeled
nucleotide. Alternatively, a nucleotide sequence encoding an hSBP polypeptide can be cloned
into a vector for the production of an mRNA probe. Such vectors, which are known in the art and
commercially available, can be used to synthesize RNA probes in vitro by addition of an
appropriate RNA polymerase such as T7, T3 or SP6 and labeled nucleotides.

A number of companies, including Pharmacia Biotech (Piscataway NJ), Promega
(Madison WI), and US Biochemical Corp (Cleveland OH), supply commercial kits and protocols
suitable for the methods described above. Suitable reporter molecules or labels include
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as
substrates, cofactors, inhibitors, magnetic particles and the like, as described in U.S. Patent Nos.
3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149 and 4,366,241, each of which
is incorporated herein by reference. Recombinant immunoglobulins can be produced
according to U.S. Patent No. 4,816,567, incorporated herein by reference.
Purification of hSBP

Host cells transformed with a nucleotide sequence encoding an hSBP polypeptide can be
cultured under conditions suitable for the expression and recovery of the hSBP polypeptide from
cell culture. The polypeptide produced by a recombinant cell may be secreted or retained
intracellularly depending on the sequence and/or the vector used. As will be understood by those
of skill in the art, expression vectors containing polynucleotides encoding hSBP polypeptides can
be designed with signal sequences that direct secretion of hSBP through a prokaryotic or
eukaryotic cell membrane.

Recombinant hSBP constructs can also include a nucleotide sequence(s) encoding one or
more polypeptide domains that, when expressed in-frame with the hSBP-encoding sequence,
facilitate purification of soluble proteins (Kroll DJ et al (1993) DNA Cell Biol 12:441-53; cf.
discussion of vectors infra containing fusion proteins). Such purification facilitating domains
include, but are not limited to, metal chelating peptides (e.g., histidine-tryptophan modules) that
allow purification with immobilized metals, protein A domains that allow purification with
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity
purification system (Immunex Corp, Seattle WA). A cleavable linker sequence(s) (e.g., Factor
Xa or enterokinase (Invitrogen, San Diego CA)) between the purification domain and the hSBP
polypeptide-encoding sequence can be included to facilitate purification. One such expression
vector provides for expression of a fusion protein comprising 6 histidine residues followed by
thioredoxin and an enterokinase cleavage site. The histidine residues facilitate purification on
IMIAC (immobilized metal ion affinity chromatography as described in Porath et al (1992)
Protein Expression and Purification 3:263-281), while the enterokinase cleavage site provides a
means for separating the hSBP domain from the remainder of the fusion protein.

hSBP polypeptides (which polypeptides encompass polypeptides composed of a portion
of the native hSBP amino acid sequence) can also be produced by direct peptide synthesis using
solid-phase techniques (cf. Stewart et al (1969) Solid-Phase Peptide Synthesis, WH Freeman Co,
San Francisco; Merrifield J (1963) J Am Chem Soc 85:2149-2154). In vitro protein synthesis
can be performed using manual techniques or by automation. Automated synthesis can be
achieved by, for example, using an Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer,
Foster City CA) in accordance with the instructions provided by the manufacturer. Various
fragments of hSBP can be chemically synthesized separately and combined using chemical
methods to produce the full length molecule.
Uses of hSBP

The rationale for use of the nucleotide and polypeptide sequences disclosed herein is
based in part on the differential expression of hSBP-encoding sequences in breast tumor tissue
and in part on the chemical and structural homology between: 1) hSBP1, rat prostatic binding
protein C1 (GI 206442; Delaey et al. supra), rat prostatic binding protein C2 (Delaey et al. 1987
Nucl Acid Res 15:1627-1641) and rabbit uteroglobin (Menne et al. 1982 Proc Natl Acad Sci USA
79:4853-4857) (Figure 5), and 2) hSBP2, human mammaglobin (GI 1199595; Watson et al. supra) and
rat prostatic binding protein C3 (GI 206543; Parker et al. supra) (Figure 6).

Accordingly, hSBP or an hSBP derivative can be used in the diagnosis and management
of breast cancer. Given the homology of hSBP with rat PBP, and the differential expression of
hSBP in human breast tumor tissue, hSBP can be used as a diagnostic marker for human breast
cancer. Expression of rat PBP is regulated by androgens (Muder et al. 1984 Biochem Biophys
Acta 781:121-9; Page et al. 1983 Cell 32:495-502) and by growth hormone (Reiter et al. 1995
Endocrinol 166:3338-44). Thus the level of hSBP can serve as a marker for transformation of
normal breast cells into cancerous cells. Alternatively, or in addition, development of breast
cancer can be detected by examining the ratio of hSBP to the levels of steroid hormones (e.g.,
testosterone or estrogen) or to other hormones (e.g., growth hormone, insulin). Thus expression
of hSBP1 and/or hSBP2 can also be used to discriminate between normal and cancerous breast
tissue, to discriminate between different types of breast cancer, to provide guidance in selection
of anti-cancer therapies, to monitor the progress of patients undergoing chemotherapy and/or
other anti-cancer treatments, to determine the success of surgery to remove cancerous tissue, and
to monitor patients who have had or are susceptible to breast cancer. In addition to diagnosis and
treatment of breast cancer after its development, detection of hSBP expression can be used to
identify patients susceptible to breast cancer. Expression of hSBP in cancerous cells can be
examined in breast tissue in situ or in pathology sections. Alternatively, if hSBP is secreted at
sufficient levels, expression of hSBP can be assessed in blood, serum, or plasma. Assessment of
levels of hSBP expression can be used to differentiate between normal and cancerous breast
tissue, and/or different types of cancerous breast tissue (e.g., invasive vs. non-invasive; ductal vs.
axillary lymph node). In addition, because hSBP is differentially expressed in breast tumor cells,
hSBP polypeptides can serve as a target for anti-cancer therapy that is targeted to
hSBP-expressing breast tumor cells. For example, cells can be transfected with antisense sequences to
hSBP-encoding polynucleotides or provided with antagonists to hSBP to reduce or eliminate
hSBP expression in cancerous breast cells. Alternatively, cancerous breast cells, or breast cells
susceptible to cancer, can be transformed (e.g., via gene therapy techniques) with hSBP-encoding
nucleic acid to provide for expression of excess hSBP and interruption of steroid binding.
hSBP Antibodies 

hSBP-specific antibodies are useful for the diagnosis of conditions and diseases
associated with expression of hSBP. Such antibodies include, but are not limited to, polyclonal,
monoclonal, chimeric, single chain, Fab fragments and fragments produced by a Fab expression
library. Neutralizing antibodies, i.e., those which inhibit a biochemical activity of hSBP, are
especially preferred for diagnostics and therapeutics.

hSBP polypeptides suitable for production of antibodies need not be biologically active;
rather, the polypeptide or oligopeptide need only be antigenic. Polypeptides used to generate
hSBP-specific antibodies generally have an amino acid sequence consisting of at least five amino
acids, preferably at least 10 amino acids. Preferably, antigenic hSBP polypeptides mimic an
epitope of the native hSBP. Antibodies specific for short hSBP polypeptides can be generated by
linking the hSBP polypeptide to a carrier, or fusing the hSBP polypeptide to another protein (e.g.,
keyhole limpet hemocyanin), and using the carrier-linked or hSBP chimeric molecule as an
antigen. In general, anti-hSBP antibodies can be produced according to methods well known in
the art.

Various hosts, generally mammalian hosts, can be used to produce anti-hSBP antibodies
(e.g., goats, rabbits, rats, mice). Anti-hSBP antibodies are produced by immunizing the host
(e.g., by injection) with an hSBP polypeptide that retains immunogenic properties (which
encompasses any portion of native hSBP, fragment or oligopeptide). Depending on the host
species, various adjuvants can be used to increase the host's immunological response. Such
adjuvants include, but are not limited to, Freund's, mineral gels (e.g., aluminum hydroxide), and
surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanin, and dinitrophenol. BCG (bacilli Calmette-Guerin) and
Corynebacterium parvum are potentially useful human adjuvants.

Monoclonal anti-hSBP antibodies can be prepared using any technique that provides for
the production of antibody molecules by immortalized cell lines in culture. These techniques
include, but are not limited to, the hybridoma technique originally described by Koehler and
Milstein (1975 Nature 256:495-497), the human B-cell hybridoma technique (Kosbor et al (1983)
Immunol Today 4:72; Cote et al (1983) Proc Natl Acad Sci 80:2026-2030) and the
EBV-hybridoma technique (Cole et al (1985) Monoclonal Antibodies and Cancer Therapy, Alan
R Liss Inc, New York NY, pp 77-96).

In addition, techniques developed for the production of "chimeric antibodies", i.e., the splicing
of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen
specificity and biological activity, can be used (Morrison et al (1984) Proc Natl Acad Sci
81:6851-6855; Neuberger et al (1984) Nature 312:604-608; Takeda et al (1985) Nature
314:452-454). Alternatively, techniques described for the production of single chain antibodies
(U.S. Patent No. 4,946,778) can be adapted to produce hSBP-specific single chain antibodies.

Antibodies can be produced in vivo or by screening recombinant immunoglobulin 
libraries or panels of highly specific binding reagents as disclosed in Orlandi et al (1989, Proc 
Natl Acad Sci 86: 3833-3837), and Winter G and Milstein C (1991; Nature 349:293-299). 

Antibody fragments having specific binding sites for an hSBP polypeptide can also be 
generated. For example, such fragments include, but are not limited to, F(ab')2 fragments, which 
can be produced by pepsin digestion of the antibody molecule, and Fab fragments, which can be 
generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab
expression libraries can be constructed to allow rapid and easy identification of monoclonal Fab
fragments with the desired specificity (Huse WD et al (1989) Science 256:1275-1281).

A variety of protocols for competitive binding or immunoradiometric assays using either 
polyclonal or monoclonal antibodies having established antigen specificities are well known in 
the art. Such immunoassays typically involve the formation of complexes between an hSBP 
polypeptide and a specific anti-hSBP antibody, and the detection and quantitation of hSBP-antibody
complex formation. A two-site, monoclonal-based immunoassay utilizing monoclonal
antibodies reactive to two noninterfering epitopes on a specific hSBP protein is preferred, but a
competitive binding assay can also be employed. These assays are described in Maddox DE et al 
(1983, J Exp Med 158:1211). 
Diagnostic Assays Using hSBP Specific Antibodies 
Particular hSBP antibodies are useful for the diagnosis of conditions or diseases
characterized by expression of hSBP (e.g., breast cancer) or in assays to monitor patients being
treated with hSBP, agonists, antagonists, or inhibitors. Diagnostic assays for hSBP include
methods using a detectably-labeled anti-hSBP antibody to detect hSBP in human body fluids or
extracts of cells or tissues. The polypeptides and antibodies of the present invention can be used
with or without modification. Frequently, the polypeptides and antibodies are labeled by
covalent or noncovalent attachment to a reporter molecule. A wide variety of such suitable
reporter molecules are known in the art.

A variety of protocols for detecting and quantifying hSBP, using either polyclonal or
monoclonal antibodies specific for an hSBP polypeptide, are known in the art. Examples include
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and
fluorescence-activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on hSBP is preferred, but a
competitive binding assay can instead be employed. These assays are described, among other
places, in Maddox DE et al (1983, J Exp Med 158:1211).

In order to provide a basis for diagnosis, normal or standard values for hSBP expression
must be established. This is accomplished by combining body fluids or cell extracts taken from
normal subjects, either animal or human, preferably human, with antibody to hSBP under
conditions suitable for complex formation according to methods well known in the art. The
amount of standard complex formation can be quantified by comparing detection levels
associated with known quantities of hSBP with detection levels associated with both control and
disease samples from biopsied tissues. Standard values obtained from normal samples are
compared with values obtained from samples from subjects potentially affected by disease.
Deviation between standard and subject values establishes the presence of disease state.
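
The comparison of subject values against standard values can be sketched in Python as follows. This is only an illustration: the text does not say how large a deviation must be, so the rule used here (more than k sample standard deviations from the mean of the normal samples, with k = 2 by default) is an assumption, and the function name and the example readings are hypothetical.

```python
from statistics import mean, stdev

def deviates_from_standard(normal_values, subject_value, k=2.0):
    """Return True if subject_value lies more than k sample standard
    deviations from the mean of the normal (standard) values.

    Assumption: a k-standard-deviation cutoff; the text specifies no rule.
    """
    if len(normal_values) < 2:
        raise ValueError("at least two normal values are needed")
    center = mean(normal_values)
    spread = stdev(normal_values)
    return abs(subject_value - center) > k * spread

# Hypothetical assay readings (arbitrary units) from normal subjects.
normal_readings = [1.0, 1.2, 0.9, 1.1, 1.0]
```

With these made-up readings, a subject value of 1.1 stays within the normal range while a value of 2.0 is flagged as deviating.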
Drug Screening 

hSBP polypeptides, which encompass biologically active or immunogenic fragments or
oligopeptides thereof, can be used for screening therapeutic compounds in any of a variety of
drug screening techniques. The polypeptide employed in such a test can be free in solution,
affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of
binding complexes between hSBP and the agent being tested can be measured.

Preferably, the drug screening technique used provides for high throughput screening of
compounds having suitable binding affinity to the hSBP, as described in detail in "Determination
of Amino Acid Sequence Antigenicity" by Geysen HN, WO Application 84/03564, published on
September 13, 1984, and incorporated herein by reference. In summary, large numbers of
different small peptide test compounds are synthesized on a solid substrate, such as plastic pins or
some other surface. The peptide test compounds are reacted with hSBP polypeptides, unreacted
materials are washed away, and bound hSBP is detected by methods well known in the art.
Purified hSBP can also be coated directly onto plates for use in the aforementioned drug
screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the
polypeptide and immobilize it on a solid support.

The invention also contemplates the use of competitive drug screening assays in which 
hSBP-specific neutralizing antibodies compete with a test compound for binding of hSBP 
polypeptide. In this manner, the antibodies can be used to detect the presence of any polypeptide
that shares one or more antigenic determinants with an hSBP polypeptide. 
Uses of the Polynucleotide Encoding hSBP 

A polynucleotide encoding an hSBP polypeptide (which polypeptides include native
hSBP and fragments thereof) can be used for diagnostic and/or therapeutic purposes. For
diagnostic purposes, polynucleotides encoding hSBP of this invention can be used to detect and
quantitate gene expression in biopsied tissues in which expression of hSBP is implicated,
particularly in diagnosis of breast cancer. The diagnostic assay is useful to assess hSBP
expression levels (e.g., to distinguish between the absence and presence of hSBP expression, as
well as to assess various hSBP expression levels (e.g., excessively high, high, moderate, or low))
and to monitor regulation of hSBP levels during therapeutic intervention. Included in the scope
of the invention are oligonucleotide sequences, antisense RNA and DNA molecules, and peptide
nucleic acids (PNAs).

Another aspect of the subject invention is to provide for hybridization or PCR probes
capable of detecting polynucleotide sequences encoding hSBP, including genomic sequences and
closely related molecules. The specificity of the probe, whether it is made from a highly specific
region, e.g., 10 unique nucleotides in the 5' regulatory region, or a less specific region, e.g.,
especially in the 3' region, and the stringency of the hybridization or amplification (maximal,
high, intermediate or low) will determine whether the probe identifies only naturally occurring
sequences encoding hSBP, alleles or related sequences.

The probes of the invention can be used in the detection of related sequences; such probes
preferably comprise at least 50% of the nucleotides from any of the hSBP polypeptide-encoding
sequences described herein. The hybridization probes of the subject invention can be derived
from the nucleotide sequence of SEQ ID NO:2 and SEQ ID NO:4, or from their corresponding
genomic sequences including promoters, enhancer elements and introns of the naturally occurring
hSBP-encoding sequences. Hybridization probes can be detectably labeled with a variety of
reporter molecules, including radionuclides (e.g., 32P or 35S), or enzymatic labels (e.g., alkaline
phosphatase coupled to the probe via avidin/biotin coupling systems), and the like.

Specific hybridization probes for hSBP-encoding DNAs can also be produced by cloning
nucleic acid sequences encoding hSBP or hSBP derivatives into vectors for production of
mRNA probes. Such vectors, which are known in the art and are commercially available, can be
used to synthesize RNA probes in vitro using an appropriate RNA polymerase (e.g., T7 or SP6
RNA polymerase) and appropriate radioactively labeled nucleotides.
Diagnostic Use 

Polynucleotide sequences encoding hSBP polypeptide can be used in the diagnosis of
conditions or diseases associated with hSBP expression, especially breast cancer. For example,
polynucleotide sequences encoding hSBP can be used in hybridization or PCR assays of fluids or
tissues from biopsies to detect hSBP expression. Suitable qualitative or quantitative methods
include Southern or northern analysis, dot blot or other membrane-based technologies; PCR
technologies; dip stick, pin, chip and ELISA technologies. All of these techniques are well
known in the art and are the basis of many commercially available diagnostic kits.

The nucleotide sequences encoding hSBP disclosed herein provide the basis for assays
that detect the onset of, susceptibility to, or the presence of breast cancer. Nucleotide sequences
encoding hSBP polypeptides can be labeled by methods known in the art and combined with a
fluid or tissue sample from a patient suspected of having or susceptible to breast cancer under
conditions suitable for the formation of hybridization complexes. After an incubation period, the
sample is washed with a compatible fluid which optionally contains a dye (or other label
requiring a developer) if the nucleotide has been labeled with an enzyme. After the compatible
fluid is rinsed off, the dye is quantitated and compared with a standard. If the amount of dye in
the biopsied or extracted sample is significantly elevated over that of a comparable negative
control sample, the nucleotide sequence has hybridized with nucleotide sequences in the sample.
The presence of hSBP-encoding nucleotide sequences in the sample, particularly the presence of
elevated levels of hSBP-encoding sequences, indicates that the patient has or is at risk of
developing the associated disease.
Such assays can also be used to evaluate the efficacy of a particular therapeutic treatment
regime in animal studies or in clinical trials, or in monitoring the treatment of an individual
patient. In order to provide a basis for the diagnosis of disease, a normal or standard profile for
hSBP expression must be established. This is accomplished by combining body fluids or cell
extracts taken from normal subjects, either animal or human, with hSBP, or a portion thereof,
under conditions suitable for hybridization or amplification. Standard hybridization can be
quantified by comparing, in the same experiment, the values obtained for normal subjects with
those obtained with a dilution series of hSBP containing known amounts of substantially
purified hSBP. Standard values obtained from normal samples are compared with values
obtained from samples from patients afflicted with hSBP-associated diseases, or suspected of
having such diseases (e.g., breast cancer). Deviation between standard and subject values is used
to establish the presence of disease.

Once disease is established, a therapeutic agent is administered and a treatment profile is
generated. Such assays can be repeated on a regular basis to evaluate whether the values in the
profile progress toward or return to a normal or standard pattern of hSBP expression. Successive
treatment profiles can be used to show the efficacy of treatment over a period of several days or
several months.

Oligonucleotides based upon hSBP sequences can be used in PCR-based techniques, as
described in U.S. Patent Nos. 4,683,195 and 4,965,188. Such oligomers are generally chemically
synthesized, or produced enzymatically or recombinantly. Oligomers generally comprise two
nucleotide sequences, one with sense orientation (5'->3') and one with antisense orientation
(3'<-5'), employed under optimized conditions for identification of a specific gene or condition.
The same two oligomers, nested sets of oligomers, or even a degenerate pool of oligomers can be
employed under less stringent conditions for detection and/or quantitation of closely related DNA
or RNA sequences.
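
The sense/antisense relationship between the two oligomers can be illustrated with a short Python sketch. It is a simplified example under stated assumptions: the function names are hypothetical, the template is an arbitrary placeholder rather than an hSBP sequence, and real primer design would also weigh melting temperature, secondary structure and specificity, which are ignored here.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]

def primer_pair(template: str, length: int = 20):
    """Return (sense, antisense) oligomers flanking a template.

    The sense oligomer copies the first `length` bases of the template; the
    antisense oligomer is the reverse complement of the last `length` bases.
    """
    template = template.upper()
    if len(template) < 2 * length:
        raise ValueError("template is too short for two non-overlapping oligomers")
    sense = template[:length]
    antisense = reverse_complement(template[-length:])
    return sense, antisense

# Arbitrary placeholder template, not an hSBP sequence.
example_pair = primer_pair("ATGCATGCAAGGTTCC", length=4)
```

Both oligomers are reported 5'->3', which is the usual written convention for synthesis orders.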

Additional methods for quantitation of expression of a particular molecule according to
the invention include radiolabeling (Melby PC et al 1993 J Immunol Methods 159:235-44) or
biotinylating (Duplaa C et al 1993 Anal Biochem 229-36) nucleotides, coamplification of a
control nucleic acid, and interpolation of experimental results according to standard curves.
Quantitation of multiple samples can be made more time efficient by running the assay in an
ELISA format in which the oligomer of interest is presented in various dilutions and rapid
quantitation is accomplished by spectrophotometric or colorimetric detection. For example, the
presence of a relatively high amount of hSBP in extracts of biopsied tissues indicates the
presence of cancerous breast cells. A definitive diagnosis of this type can allow health
professionals to begin aggressive treatment and prevent further worsening of the condition.
Similarly, further assays can be used to monitor the progress of a patient during treatment.
Furthermore, the nucleotide sequences disclosed herein can be used in molecular biology
techniques that have not yet been developed, provided the new techniques rely on properties of
nucleotide sequences that are currently known such as the triplet genetic code, specific base pair
interactions, and the like.
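
Interpolation of an experimental result on a standard curve, as mentioned above, can be sketched in Python. The piecewise-linear scheme, the function name and the example standards are assumptions for illustration; the text does not prescribe a curve-fitting method, and real assays often use nonlinear fits.

```python
def interpolate_amount(standards, signal):
    """Estimate an amount from a measured signal by linear interpolation.

    standards: iterable of (known_amount, measured_signal) pairs from a
    dilution series. Assumes the signal increases with the amount.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (amount_lo, signal_lo), (amount_hi, signal_hi) in zip(points, points[1:]):
        if signal_lo <= signal <= signal_hi:
            if signal_hi == signal_lo:
                return amount_lo
            fraction = (signal - signal_lo) / (signal_hi - signal_lo)
            return amount_lo + fraction * (amount_hi - amount_lo)
    raise ValueError("signal lies outside the range of the standard curve")

# Hypothetical dilution series: (amount, absorbance) in arbitrary units.
dilution_series = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
```

Signals outside the measured range raise an error rather than being extrapolated, which is a deliberately conservative choice in this sketch.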
Therapeutic Use 

Based upon the homology of hSBP to genes encoding prostatic binding proteins and its
expression profile in breast tumor cells, polynucleotide sequences encoding
hSBP disclosed herein may be useful in the treatment of conditions such as breast cancer or other
conditions associated with hSBP expression or over-expression.

Expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or
from various bacterial plasmids, can be used for delivery of nucleotide sequences to the targeted
organ, tissue or cell population. Recombinant vectors for expression of antisense hSBP
polynucleotides can be constructed according to methods well known in the art (see, for example,
the techniques described in Sambrook et al (supra) and Ausubel et al (supra)).

Polynucleotides comprising the full length cDNA sequence and/or its regulatory
elements enable researchers to use sequences encoding hSBP as an investigative tool in sense
(Youssoufian H and HF Lodish 1993 Mol Cell Biol 13:98-104) or antisense (Eguchi et al (1991)
Ann Rev Biochem 60:631-652) regulation of gene function. Such technology is now well known
in the art, and sense or antisense oligomers, or larger fragments, can be designed from various
locations along the coding or control regions.

Expression of genes encoding hSBP can be decreased by transfecting a cell or tissue with
expression vectors that express high levels of a desired hSBP-encoding fragment. Such
constructs can flood cells with untranslatable sense or antisense sequences. Even in the absence
of integration into the DNA, such vectors can continue to transcribe RNA molecules until all
copies are disabled by endogenous nucleases. Transient expression can last for a month or more
with a non-replicating vector (Mettler I, personal communication) and even longer if appropriate
replication elements are part of the vector system.

As mentioned above, modifications of gene expression can be obtained by designing
antisense molecules, DNA, RNA or PNA, to the control regions of the gene encoding hSBP (i.e.,
the promoters, enhancers, and introns). Oligonucleotides derived from the transcription initiation
site, e.g., between -10 and +10 regions of the leader sequence, are preferred. The antisense
molecules can also be designed to block translation of mRNA by preventing the transcript from
binding to ribosomes. Similarly, inhibition of expression can be achieved using "triple helix"
base-pairing methodology. Triple helix pairing compromises the ability of the double helix to
open sufficiently for binding of polymerases, transcription factors, or regulatory molecules.
Recent therapeutic advances using triplex DNA were reviewed by Gee JE et al (In: Huber BE and
BI Carr (1994) Molecular and Immunologic Approaches, Futura Publishing Co, Mt Kisco NY).

Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the
ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. The
invention contemplates engineered hammerhead motif ribozyme molecules that can specifically
and efficiently catalyze endonucleolytic cleavage of sequences encoding hSBP.

Specific ribozyme cleavage sites within any potential RNA target are initially identified
by scanning the target molecule for ribozyme cleavage sites, which include the sequences
GUA, GUU and GUC. Once identified, short RNA sequences between 15 and 20
ribonucleotides corresponding to a region of the target gene containing the cleavage site can be
evaluated for secondary structural features that can render the oligonucleotide inoperable. The
suitability of candidate targets can also be evaluated by testing accessibility to hybridization with
complementary oligonucleotides using ribonuclease protection assays.
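
The scanning step described above can be sketched in Python. This is a minimal illustration, not a validated design tool: the function names are hypothetical, the way the window is positioned around the site is an assumption, and the secondary-structure and accessibility evaluations mentioned in the text are not modeled.

```python
import re

def find_cleavage_sites(rna: str):
    """Return (index, triplet) for every GUA, GUU or GUC in the target RNA.

    DNA-style input is accepted by converting T to U first.
    """
    rna = rna.upper().replace("T", "U")
    return [(match.start(), match.group()) for match in re.finditer(r"GU[ACU]", rna)]

def candidate_window(rna: str, site_index: int, length: int = 17) -> str:
    """Return a 15-20 nt stretch of the target containing the triplet at site_index.

    Assumption: the window is roughly centered on the triplet and shifted
    inward when the site lies near either end of the target.
    """
    if not 15 <= length <= 20:
        raise ValueError("window length must be between 15 and 20 nucleotides")
    rna = rna.upper().replace("T", "U")
    flank = (length - 3) // 2
    start = max(0, site_index - flank)
    end = min(len(rna), start + length)
    start = max(0, end - length)
    return rna[start:end]
```

Each site returned by `find_cleavage_sites` can be passed to `candidate_window` to obtain the short region that would then be examined for secondary structure.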

Antisense molecules and ribozymes of the invention can be prepared by methods known
in the art for the synthesis of RNA molecules, including techniques for chemical oligonucleotide
synthesis, e.g., solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules
can be generated by in vitro and in vivo transcription of DNA sequences encoding hSBP. Such
DNA sequences can be incorporated into a wide variety of vectors with suitable RNA polymerase
promoters (e.g., T7 or SP6). Alternatively, antisense cDNA constructs useful in the constitutive
or inducible synthesis of antisense RNA can be introduced into cell lines, cells, or tissues.

RNA molecules can be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3'
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than
phosphodiester linkages within the backbone of the molecule. This concept is inherent in the
production of PNAs and can be extended in all of these molecules by the inclusion of
nontraditional bases such as inosine, queuosine and wybutosine as well as acetyl-, methyl-, thio-
and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine that are not as
easily recognized by endogenous endonucleases.

Methods for introducing vectors into cells or tissues include those methods discussed
infra and which are equally suitable for in vivo, in vitro and ex vivo therapy. In ex vivo therapy,
vectors are introduced into stem cells obtained from the patient and clonally propagated for
autologous transplant back into that same patient (see, e.g., U.S. Patent Nos. 5,399,493 and
5,437,994, incorporated herein by reference). Transfection and liposome methods for delivery
of a nucleotide sequence of interest to accomplish gene therapy are well known in the art.

Furthermore, the nucleotide sequences for hSBP disclosed herein can be used in
molecular biology techniques that have not yet been developed, provided the new techniques rely
on properties of nucleotide sequences that are currently known, including but not limited to such
properties as the triplet genetic code and specific base pair interactions.
Detection and Mapping of Related Polynucleotide Sequences 

The hSBP nucleic acid sequences can also be used to generate hybridization probes for
mapping the naturally occurring genomic sequence. The sequence can be mapped to a particular
chromosome or to a specific region of the chromosome using well known techniques. These
include in situ hybridization to chromosomal spreads, flow-sorted chromosomal preparations, or
artificial chromosome constructions such as yeast artificial chromosomes, bacterial artificial
chromosomes, bacterial P1 constructions or single chromosome cDNA libraries as reviewed in
Price CM (1993; Blood Rev 7:127-34) and Trask BJ (1991; Trends Genet 7:149-54).

The technique of fluorescent in situ hybridization of chromosome spreads is described in,
for example, Verma et al (1988) Human Chromosomes: A Manual of Basic Techniques,
Pergamon Press, New York NY. Fluorescent in situ hybridization of chromosomal preparations
and other physical chromosome mapping techniques can be correlated with additional genetic
map data. Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a gene encoding hSBP on a physical
chromosomal map and a specific disease (or predisposition to a specific disease) can help delimit
the region of DNA associated with that genetic disease. The nucleotide sequences of the subject
invention can be used to detect differences in gene sequences between normal, carrier, or affected
individuals.

In situ hybridization of chromosomal preparations and physical mapping techniques such
as linkage analysis using established chromosomal markers can be used for extending genetic
maps. For example, a sequence tagged site based map of the human genome was recently
published by the Whitehead-MIT Center for Genomic Research (Hudson TJ et al (1995) Science
270:1945-1954). Often the placement of a gene on the chromosome of another mammalian
species such as a mouse (Whitehead Institute/MIT Center for Genome Research, Genetic Map of
the Mouse, Database Release 10, April 28, 1995) can reveal associated markers even if the
number or arm of a particular human chromosome is not known. New sequences can be assigned
to chromosomal arms, or parts thereof, by physical mapping. Physical mapping provides
valuable information to investigators searching for disease genes using positional cloning or other
gene discovery techniques. Once a disease or syndrome, such as ataxia telangiectasia (AT), has
been crudely localized by genetic linkage to a particular genomic region, for example, AT to
11q22-23 (Gatti et al (1988) Nature 336:577-580), other sequences mapping to that area may
represent associated or regulatory genes for further investigation. The nucleotide sequence of the
subject invention can also be used to detect differences in the chromosomal location due to
translocation, inversion, etc. among normal, carrier or affected individuals.
Pharmaceutical Compositions 

The present invention relates to pharmaceutical compositions which can comprise
nucleotides, proteins, antibodies, agonists, antagonists, or inhibitors, alone or in combination
with at least one other agent, such as a stabilizing compound, which can be administered in any
sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered saline,
dextrose, and water. Any of these molecules can be administered to a patient alone or in
combination with other agents, drugs or hormones, in pharmaceutical compositions in which it is
mixed with excipient(s) or with pharmaceutically acceptable carriers. In one embodiment of the
present invention, the pharmaceutically acceptable carrier is pharmaceutically inert.

Administration of Pharmaceutical Compositions

Administration of pharmaceutical compositions is accomplished orally or parenterally.
Methods of parenteral delivery include topical, intra-arterial (e.g., directly to the breast tumor),
intramuscular, subcutaneous, intramedullary, intrathecal, intraventricular, intravenous,
intraperitoneal, or intranasal administration. In addition to the active ingredients, these
pharmaceutical compositions can contain suitable pharmaceutically acceptable carriers
comprising excipients and auxiliaries that facilitate processing of the active compounds into
preparations for pharmaceutical use. Further details on techniques for formulation and
administration can be found in the latest edition of "Remington's Pharmaceutical Sciences"
(Mack Publishing Co, Easton PA).

Pharmaceutical compositions for oral administration can be formulated using
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral
administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets,
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for ingestion by
the patient.

Pharmaceutical preparations for oral use can be obtained through combination of active
compounds with solid excipient, optionally grinding a resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries if desired, to obtain tablets or dragee cores.
Suitable excipients are carbohydrate or protein fillers such as sugars, including lactose, sucrose,
mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose such as
methyl cellulose, hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums
including arabic and tragacanth; and proteins such as gelatin and collagen. If desired,
disintegrating or solubilizing agents can be added, such as cross-linked polyvinyl pyrrolidone,
agar, alginic acid, or a salt thereof, such as sodium alginate.

Dragee cores are provided with suitable coatings such as concentrated sugar solutions,
which can also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol,
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments can be added to the tablets or dragee coatings for product identification or
to characterize the quantity of active compound, i.e., dosage.

Pharmaceutical preparations that can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a coating such as glycerol or sorbitol.
Push-fit capsules can contain active ingredients mixed with a filler or binders such as lactose or
starches, lubricants such as talc or magnesium stearate, and, optionally, stabilizers. In soft
capsules, the active compounds can be dissolved or suspended in suitable liquids, such as fatty
oils, liquid paraffin, or liquid polyethylene glycol with or without stabilizers.

Pharmaceutical formulations for parenteral administration include aqueous solutions of
active compounds. For injection, the pharmaceutical compositions of the invention can be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks's solution, Ringer's solution, or physiologically buffered saline. Aqueous injection
suspensions can contain substances that increase the viscosity of the suspension, such as sodium
carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds
can be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles
include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides; or liposomes. Optionally, the suspension can also contain suitable stabilizers or
agents that increase the solubility of the compounds to allow for the preparation of highly
concentrated solutions.

For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art. 
Manufacture and Storage 

The pharmaceutical compositions of the present invention can be manufactured in any
suitable manner known in the art, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes.

The pharmaceutical composition can be provided as a salt and can be formed with many
acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic,
etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the
corresponding free base forms. In other cases, the preferred preparation can be a lyophilized
powder in 1 mM-50 mM histidine, 0.1%-2% sucrose, 2%-7% mannitol at a pH range of 4.5 to 5.5
that is combined with buffer prior to use.

After pharmaceutical compositions comprising a compound of the invention formulated
in an acceptable carrier have been prepared, they can be placed in an appropriate container and
labeled for treatment of an indicated condition. For administration of hSBP, such labeling would
include amount, frequency and method of administration.
Therapeutically Effective Dose 

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve the
intended purpose. The determination of an effective dose is well within the capability of those
skilled in the art.

For any compound, the therapeutically effective dose can be estimated initially either in
cell culture assays, e.g., of neoplastic cells, or in animal models, usually mice, rabbits, dogs, or
pigs. The animal model is also used to achieve a desirable concentration range and route of
administration. Such information can then be used to determine useful doses and routes for
administration in humans.

A therapeutically effective dose refers to that amount of protein or its antibodies,
antagonists, or inhibitors that ameliorates the symptoms or condition. Therapeutic efficacy and
toxicity of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., ED50 (the dose therapeutically effective in 50% of the
population) and LD50 (the dose lethal to 50% of the population). The dose ratio between
toxic and therapeutic effects is the therapeutic index, and it can be expressed as the ratio LD50/ED50.
Pharmaceutical compositions that exhibit large therapeutic indices are preferred. The data
obtained from cell culture assays and animal studies are used in formulating a range of dosage for
human use. The dosage of such compounds lies preferably within a range of circulating
concentrations that include the ED50 with little or no toxicity. The actual dosage can vary within
this range depending upon, for example, the dosage form employed, sensitivity of the patient, and
the route of administration.
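
The ratio described above can be sketched in a few lines of Python. This is only an
illustration of the arithmetic: the function name, the compound names, and the dose values
are hypothetical and are not data from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, chosen only to illustrate the ratio.
candidates = {
    "compound_A": (500.0, 5.0),
    "compound_B": (200.0, 20.0),
}

# Larger indices are preferred, so rank candidates from largest to smallest.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
print(ranked)
```

Under these made-up numbers the first compound has the wider margin between the effective
and the lethal dose.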

The exact dosage is chosen by the individual physician in view of the patient to be
treated. Dosage and administration are adjusted to provide sufficient levels of the active moiety
or to maintain the desired effect. Additional factors that may be taken into account include the
severity of the disease state, e.g., tumor size and location; age, weight and gender of the patient;
diet; time and frequency of administration; drug combination(s); reaction sensitivities; and
tolerance/response to therapy. Long-acting pharmaceutical compositions can be administered
every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate
of the particular formulation.

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of
about 1 g, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or
their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to
particular cells, conditions, locations, etc.

It is contemplated, for example, that hSBP or an hSBP derivative can be delivered in a
suitable formulation to block the progression of breast cancer. Similarly, administration of hSBP
antagonists may also inhibit the activity or shorten the lifespan of this protein.

The examples below are provided to illustrate the subject invention and are not included
for the purpose of limiting the invention.

INDUSTRIAL APPLICABILITY

I. Construction of BRSTTUT01 cDNA Libraries

The BRSTTUT01 cDNA library was constructed from breast tumor removed from a 55
year old female (lot #0005; Mayo Clinic, Rochester MN). The frozen tissue was immediately
homogenized and lysed using a Brinkmann Homogenizer Polytron-PT 3000 (Brinkmann
Instruments, Inc, Westbury NY) in guanidinium isothiocyanate solution. Lysates were then
loaded on a 5.7 M CsCl cushion and ultracentrifuged in a SW28 swinging bucket rotor for 18
hours at 25,000 rpm at ambient temperature. The RNA was extracted once with acid phenol at
pH 4.0 and once with phenol chloroform at pH 8.0 and precipitated using 0.3 M sodium acetate
and 2.5 volumes of ethanol, resuspended in DEPC-treated water and DNase treated for 25 min at
37° C. The reaction was stopped with an equal volume of acid phenol, and the RNA was isolated
using the Qiagen Oligotex kit (QIAGEN Inc, Chatsworth CA) and used to construct the cDNA
library. The RNA was handled according to the recommended protocols in the Superscript
Plasmid System for cDNA Synthesis and Plasmid Cloning (catalog #18248-013; Gibco/BRL).
cDNAs were fractionated on a Sepharose CL4B column (catalog #275105, Pharmacia), and those
cDNAs exceeding 400 bp were ligated into pSport I. The plasmid pSport I was subsequently
transformed into DH5α™ competent cells (Cat. #18258-012, Gibco/BRL).

II. Isolation and Sequencing of cDNA Clones From BRSTTUT01

Plasmid DNA was released from the cells and purified using the Miniprep Kit (Catalogue
#77468; Advanced Genetic Technologies Corporation, Gaithersburg MD). This kit consists of a
96 well block with reagents for 960 purifications. The recommended protocol was employed
except for the following changes: 1) the 96 wells were each filled with only 1 ml of sterile
Terrific Broth (Catalog #22711, LIFE TECHNOLOGIES™, Gaithersburg MD) with
carbenicillin at 25 mg/L and glycerol at 0.4%; 2) the bacteria were cultured for 24 hours after the
wells were inoculated and then lysed with 60 µl of lysis buffer; 3) a centrifugation step
employing the Beckman GS-6R at 2900 rpm for 5 min was performed before the contents of the
block were added to the primary filter plate; and 4) the optional step of adding isopropanol to
TRIS buffer was not routinely performed. After the last step in the protocol, samples were
transferred to a Beckman 96-well block for storage.

The cDNAs were sequenced by the method of Sanger F and AR Coulson (1975; J Mol
Biol 94:441f), using a Hamilton Micro Lab 2200 (Hamilton, Reno NV) in combination with four
Peltier Thermal Cyclers (PTC200 from MJ Research, Watertown MA) and Applied Biosystems
377 or 373 DNA Sequencing Systems (Perkin Elmer), and reading frame was determined.

III. Homology Searching of cDNA Clones and Their Deduced Proteins

Each cDNA was compared to sequences in GenBank using a search algorithm developed
by Applied Biosystems and incorporated into the INHERIT™ 670 Sequence Analysis System. In
this algorithm, Pattern Specification Language (TRW Inc, Los Angeles CA) was used to
determine regions of homology. The three parameters that determine how the sequence
comparisons run were window size, window offset, and error tolerance. Using a combination of
these three parameters, the DNA database was searched for sequences containing regions of
homology to the query sequence, and the appropriate sequences were scored with an initial value.
Subsequently, these homologous regions were examined using dot matrix homology plots to
distinguish regions of homology from chance matches. Smith-Waterman alignments were used
to display the results of the homology search.

Peptide and protein sequence homologies were ascertained using the INHERIT™ 670
Sequence Analysis System in a way similar to that used in DNA sequence homologies. Pattern
Specification Language and parameter windows were used to search protein databases for
sequences containing regions of homology which were scored with an initial value. Dot-matrix
homology plots were examined to distinguish regions of significant homology from chance
matches.

BLAST, which stands for Basic Local Alignment Search Tool (Altschul SF (1993) J Mol
Evol 36:290-300; Altschul, SF et al (1990) J Mol Biol 215:403-10), was used to search for local
sequence alignments. BLAST produces alignments of both nucleotide and amino acid sequences
to determine sequence similarity. Because of the local nature of the alignments, BLAST is
especially useful in determining exact matches or in identifying homologs. BLAST is useful for
matches that do not contain gaps. The fundamental unit of BLAST algorithm output is the
High-scoring Segment Pair (HSP).

An HSP consists of two sequence fragments of arbitrary but equal lengths whose
alignment is locally maximal and for which the alignment score meets or exceeds a threshold or
cutoff score set by the user. The BLAST approach identifies HSPs between a query sequence
and a database sequence, evaluates the statistical significance of any matches found, and reports
only those matches which satisfy the user-selected threshold of significance. The parameter E
establishes the statistically significant threshold for reporting database sequence matches. E is
interpreted as the upper bound of the expected frequency of chance occurrence of an HSP (or set
of HSPs) within the context of the entire database search. Any database sequence whose match
satisfies E is reported in the program output.
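
The role of E as a reporting threshold can be sketched as follows. The text does not give a
formula, so this sketch assumes the standard Karlin-Altschul form for ungapped local
alignments, E = K * m * n * exp(-lambda * S), where m and n are the query and database
lengths and S is the HSP score. The function names and the values used for K and lambda
are illustrative placeholders, not parameters taken from this specification.

```python
import math


def expected_hsps(score: float, query_len: int, db_len: int,
                  lam: float, k: float) -> float:
    """Expected number of chance HSPs scoring at least `score`.

    Assumes the Karlin-Altschul form E = K * m * n * exp(-lambda * S).
    """
    return k * query_len * db_len * math.exp(-lam * score)


def is_reported(score: float, query_len: int, db_len: int,
                e_threshold: float, lam: float, k: float) -> bool:
    """A match is reported when its expected chance frequency is within E."""
    return expected_hsps(score, query_len, db_len, lam, k) <= e_threshold


# Placeholder statistics and sizes, for illustration only.
LAM, K = 0.3, 0.1
for s in (30, 60):
    e = expected_hsps(s, query_len=300, db_len=1_000_000, lam=LAM, k=K)
    print(s, e, is_reported(s, 300, 1_000_000, e_threshold=10, lam=LAM, k=K))
```

Raising the score lowers the expected chance frequency exponentially, which is why a
modest score cutoff removes most chance matches from the output.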

IV. Northern Analysis 

Northern analysis is a laboratory technique used to detect the presence of a gene transcript
and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound (Sambrook et al, supra).

Analogous computer techniques using BLAST (Altschul SF 1993 and 1990, supra) are
used to search for identical or related molecules in nucleotide databases such as GenBank or the
LIFESEQ™ database (Incyte, Palo Alto CA). This analysis is much faster than multiple,
membrane-based hybridizations. In addition, the sensitivity of the computer search can be
modified to determine whether any particular match is categorized as exact or homologous.
The basis of the search is the product score, which is defined as:

    (% sequence identity x % maximum BLAST score) / 100

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. For example, with a product score of 40, the match will be exact
within a 1-2% error; and at 70, the match will be exact. Homologous molecules are usually
identified by selecting those which show product scores between 15 and 40, although lower
scores can identify related molecules. The abundance data (Abun) represent the number of
transcripts of the gene of interest in the cDNA library. Percent abundance is calculated by
dividing the number of transcripts of a gene of interest present in a cDNA library by the total
number of transcripts in the cDNA library.
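
The product score and percent abundance described above reduce to simple arithmetic, as the
following Python sketch shows. The function names are hypothetical, the cutoffs in
`classify` are one reading of the thresholds given in the text (the behavior between 40 and
70 is an assumption), and multiplying the abundance ratio by 100 to express it as a percent
is also an assumption.

```python
def product_score(pct_identity: float, pct_max_blast_score: float) -> float:
    """(% sequence identity x % maximum BLAST score) / 100."""
    return pct_identity * pct_max_blast_score / 100.0


def classify(score: float) -> str:
    """Map a product score to a match category (assumed cutoffs, see lead-in)."""
    if score >= 70:
        return "exact"
    if score >= 40:
        return "exact within 1-2% error"
    if score >= 15:
        return "homologous"
    return "possibly related"


def percent_abundance(gene_transcripts: int, total_transcripts: int) -> float:
    """Transcripts of the gene of interest as a percentage of the library."""
    if total_transcripts <= 0:
        raise ValueError("library must contain at least one transcript")
    return 100.0 * gene_transcripts / total_transcripts


# Made-up inputs, for illustration only.
print(classify(product_score(100, 70)))
print(classify(product_score(50, 40)))
print(percent_abundance(3, 1500))
```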

V. Extension of hSBP-Encoding Polynucleotides to Full Length or to Recover
Regulatory Elements

Full length hSBP-encoding nucleic acid sequences (SEQ ID NO:2, SEQ ID NO:4, or SEQ
ID NO:6) are used to design oligonucleotide primers for extending a partial nucleotide sequence
to full length and/or for obtaining 5' sequences from genomic libraries. One synthesized primer
is used to initiate extension in the antisense direction (XLR), and a second synthesized primer is
used to extend sequence in the sense direction (XLF). Primers allow the extension of the known
hSBP-encoding sequence "outward", generating amplicons containing new, unknown nucleotide
sequence for the region of interest (U.S. Patent Application 08/487,112, filed June 7, 1995,
specifically incorporated by reference). The initial primers are designed from the cDNA using
OLIGO® 4.06 Primer Analysis Software (National Biosciences), or another appropriate program.
The initial primers are preferably designed to be 22-30 nucleotides in length, to have a GC content
of 50% or more, and to anneal to the target sequence at temperatures of about 68°-72° C. Any
stretch of nucleotides that would result in hairpin structures and primer-primer dimerizations is
avoided.
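
Two of the primer criteria above (length of 22-30 nucleotides and GC content of 50% or more)
are easy to screen programmatically. The sketch below is a hypothetical helper, not the
OLIGO software named in the text; it does not model annealing temperature, hairpins, or
primer-primer dimers, and the example sequences are invented.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_basic_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check only the length and GC-content criteria stated in the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Invented candidate primers, for illustration only.
candidates = [
    "GCGATCGCTAGCCGATCGGCTAAG",  # 24 nt, GC-rich
    "ATATATTTAAATATTATAAATTTA",  # 24 nt, AT-rich
    "GCGCGCGCGC",                # too short
]
for primer in candidates:
    print(primer, len(primer), round(gc_fraction(primer), 3),
          passes_basic_criteria(primer))
```

Candidates that pass this screen would still need to be checked for secondary structure and
annealing temperature with a dedicated primer analysis program.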

The original, selected cDNA libraries, or a human genomic library, are used to extend the
sequence; the latter is most useful to obtain 5' upstream regions. If more extension is necessary
or desired, additional sets of primers are designed to further extend the known region.

By following the instructions for the XL-PCR kit (Perkin Elmer) and thoroughly mixing
the enzyme and reaction mix, high fidelity amplification is obtained. Beginning with 40 pmol of
each primer and the recommended concentrations of all other components of the kit, PCR is
performed using the Peltier Thermal Cycler (PTC200; MJ Research, Watertown MA) and the
following parameters:

Step 1     94° C for 1 min (initial denaturation)
Step 2     65° C for 1 min
Step 3     68° C for 6 min
Step 4     94° C for 15 sec
Step 5     65° C for 1 min
Step 6     68° C for 7 min
Step 7     Repeat steps 4-6 for 15 additional cycles
Step 8     94° C for 15 sec
Step 9     65° C for 1 min
Step 10    68° C for 7:15 min
Step 11    Repeat steps 8-10 for 12 cycles
Step 12    72° C for 8 min
Step 13    4° C (and holding)
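
The cycling program above can be written down as data, which makes the total programmed
hold time easy to compute. This is a sketch under stated assumptions: the class names are
hypothetical, ramp times between temperatures are ignored, the final 4° C hold is omitted
because it is open-ended, and "repeat for N cycles" is read as N passes in addition to the
first (16 passes of steps 4-6 and 13 passes of steps 8-10).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hold:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Block:
    holds: tuple   # Hold objects executed in order
    passes: int    # how many times the block runs in total


PROGRAM = [
    Block((Hold(94, 60), Hold(65, 60), Hold(68, 360)), passes=1),   # steps 1-3
    Block((Hold(94, 15), Hold(65, 60), Hold(68, 420)), passes=16),  # steps 4-7
    Block((Hold(94, 15), Hold(65, 60), Hold(68, 435)), passes=13),  # steps 8-11
    Block((Hold(72, 480),), passes=1),                              # step 12
]


def programmed_seconds(program) -> int:
    """Sum of all hold times, excluding ramping and the final 4 C hold."""
    return sum(b.passes * sum(h.seconds for h in b.holds) for b in program)


total = programmed_seconds(PROGRAM)
print(total, "seconds, about", round(total / 3600, 2), "hours")
```

If the second repeat instruction is instead read as 12 passes in total, only the `passes`
value of the third block changes.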

A 5-10 µl aliquot of the reaction mixture is analyzed by electrophoresis on a low
concentration (about 0.6-0.8%) agarose mini-gel to determine which reactions were successful in
extending the sequence. Bands containing the largest products are selected and cut out of the
gel. Further purification is accomplished using a commercial gel extraction method such as
QIAQuick™ (QIAGEN Inc). After recovery of the DNA, Klenow enzyme is used to trim
single-stranded nucleotide overhangs, creating blunt ends to facilitate religation and cloning.

After ethanol precipitation, the products are redissolved in 13 µl of ligation buffer, 1 µl
T4-DNA ligase (15 units) and 1 µl T4 polynucleotide kinase are added, and the mixture is
incubated at room temperature for 2-3 hours or overnight at 16° C. Competent E. coli cells (in
40 µl of appropriate media) are transformed with 3 µl of ligation mixture and cultured in 80 µl of
SOC medium (Sambrook J et al, supra). After incubation for one hour at 37° C, the whole
transformation mixture is plated on Luria Bertani (LB)-agar (Sambrook J et al, supra) containing
2xCarb. The following day, several colonies are randomly picked from each plate and cultured in
150 µl of liquid LB/2xCarb medium placed in an individual well of an appropriate,
commercially-available, sterile 96-well microtiter plate. The following day, 5 µl of each
overnight culture is transferred into a non-sterile 96-well plate and, after dilution 1:10 with water,
5 µl of each sample is transferred into a PCR array.

For PCR amplification, 18 µl of concentrated PCR reaction mix (3.3x) containing 4 units
of rTth DNA polymerase, a vector primer and one or both of the gene specific primers used for
the extension reaction are added to each well. Amplification is performed using the
following conditions:

Step 1     94° C for 60 sec
Step 2     94° C for 20 sec
Step 3     55° C for 30 sec
Step 4     72° C for 90 sec
Step 5     Repeat steps 2-4 for an additional 29 cycles
Step 6     72° C for 180 sec
Step 7     4° C (and holding)

Aliquots of the PCR reactions are run on agarose gels together with molecular weight
markers. The sizes of the PCR products are compared to the original partial cDNAs, and
appropriate clones are selected, ligated into plasmid and sequenced.
VI. Labeling and Use of Hybridization Probes

Hybridization probes derived from SEQ ID NO:2 and SEQ ID NO:4 are used to screen
cDNAs, genomic DNAs or mRNAs. Although the labeling of oligonucleotides, consisting of
about 20 base-pairs, is specifically described, essentially the same procedure is used with larger
cDNA fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO
4.06 (National Biosciences), labeled by combining 50 pmol of each oligomer and 250 mCi of
[γ-32P] adenosine triphosphate (Amersham, Chicago IL) and T4 polynucleotide kinase (DuPont
NEN®, Boston MA). The labeled oligonucleotides are substantially purified with Sephadex G-25
superfine resin column (Pharmacia). A portion containing 10^7 counts per minute of each of the
sense and antisense oligonucleotides is used in a typical membrane based hybridization analysis
of human genomic DNA digested with one of the following endonucleases (Ase I, Bgl II, Eco RI,
Pst I, Xba I, or Pvu II; DuPont NEN®).

The DNA from each digest is fractionated on a 0.7 percent agarose gel and transferred to
nylon membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out
for 16 hours at 40° C. To remove nonspecific signals, blots are sequentially washed at room
temperature under increasingly stringent conditions up to 0.1 x saline sodium citrate and 0.5%
sodium dodecyl sulfate. After XOMAT AR™ film (Kodak, Rochester NY) is exposed to the
blots in a Phosphoimager cassette (Molecular Dynamics, Sunnyvale CA) for several hours,
hybridization patterns are compared visually.
VII. Antisense Molecules

An hSBP polypeptide-encoding sequence (encompassing full length and
partial hSBP sequences) is used to inhibit in vivo or in vitro expression of naturally occurring
hSBP. Although use of antisense oligonucleotides, comprising about 20 base-pairs, is
specifically described, essentially the same procedure is used with larger cDNA fragments. An
oligonucleotide based on the coding sequences of hSBP, as shown in Figures 1A and 1B and 2A
and 2B, is used to inhibit expression of naturally occurring hSBP. The complementary
oligonucleotide is designed from the most unique 5' sequence as shown in Figures 1A and 1B and
2A and 2B and used either to inhibit transcription by preventing promoter binding to the
upstream nontranslated sequence or translation of an hSBP-encoding transcript by preventing the
ribosome from binding. Using an appropriate portion of the leader and 5' sequence of SEQ ID
NO:2 or SEQ ID NO:4, an effective antisense oligonucleotide includes any 15-20 nucleotides
spanning the region which translates into the signal or early coding sequence of the polypeptide
as shown in Figures 1A and 1B, and 2A and 2B.
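
The design step above relies on specific base pairing: an antisense oligonucleotide is the
reverse complement of the sense-strand window it targets. The Python sketch below
enumerates candidate windows of a chosen length (15-20 nucleotides in the text). The
function names are hypothetical, and the example sequence is an invented placeholder, not
a portion of SEQ ID NO:2 or SEQ ID NO:4.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_candidates(sense: str, length: int = 20):
    """Yield (start, oligo) pairs: one antisense oligo per sense-strand window."""
    if not 15 <= length <= 20:
        raise ValueError("the text describes oligonucleotides of 15-20 nucleotides")
    for start in range(len(sense) - length + 1):
        yield start, reverse_complement(sense[start:start + length])


# Invented 5' coding-region placeholder, for illustration only.
sense_5prime = "ATGGCCAAGCTGCTGACCTTCCTG"
for start, oligo in antisense_candidates(sense_5prime, length=20):
    print(start, oligo)
```

Which of the enumerated candidates is "most unique" would still have to be judged against
the rest of the transcript population, which this sketch does not attempt.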
VIII. Expression of hSBP

Expression of the hSBP is accomplished by subcloning the cDNAs into appropriate
vectors and transfecting the vectors into host cells. In this case, the cloning vector, pSport,
previously used for the generation of the cDNA library is used to express hSBP polypeptides in
E. coli. The pSport vector contains a promoter for β-galactosidase upstream of the cloning site,
followed by a sequence encoding the amino-terminal Met and the subsequent 7 residues of
β-galactosidase. Sequences encoding a bacteriophage promoter useful for transcription and a
linker containing a number of unique restriction sites are positioned immediately after the eight
β-galactosidase residue-encoding sequences.

IPTG is used to induce production of the fusion protein in an isolated, transfected
bacterial strain according to standard methods. The fusion protein comprises the first seven
residues of β-galactosidase, about 5 to 15 residues of linker, and the full length hSBP-encoding
sequence. The signal sequence directs the secretion of hSBP polypeptide into the bacterial
growth media, which can then be used directly in the following activity assay.

IX. hSBP Activity 

Given the homology of hSBP with rat prostatic binding protein (rPBP), human
mammaglobin, rabbit uteroglobin, and FHG 22, activity of hSBP can be assessed by the ability of
the polypeptide to bind to steroid. Methods for assessing steroid binding to a polypeptide are
well known in the art (see, e.g., Heyns et al. 1977 Eur J Biochem 78:221-230). Alternatively,
given the homology between hSBP and rPBP, and the similarities between rPBP and estramustine
binding protein (EMBP), hSBP activity can be assessed by the ability of hSBP to bind
estramustine. Methods for assessing estramustine binding are well known in the art (see, e.g.,
Appelgren et al. 1979 Acta Pharmacol Toxicol 43:368-374; Forsgren et al. 1979 Cancer Res
39:5155-5164; Hoisaeter et al. 1981 J Steroid Biochem 14:251-160).

X. Production of hSBP Specific Antibodies 

hSBP polypeptide substantially purified using PAGE electrophoresis (Sambrook, supra)
is used to immunize rabbits and to produce antibodies using standard protocols. The amino acid
sequence translated from hSBP is analyzed using DNAStar software (DNAStar Inc) to determine
regions of high immunogenicity, and a corresponding oligopolypeptide is synthesized and used to
produce antibodies according to methods known to those of skill in the art. Analysis to select
appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, is described by
Ausubel et al (supra).

Typically, antibodies are generated using polypeptides about 15 residues in length,
which are synthesized on an Applied Biosystems Peptide Synthesizer Model 431A using
fmoc-chemistry, and coupled to keyhole limpet hemocyanin (KLH, Sigma) by reaction with
M-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS; Ausubel et al, supra). Rabbits are
immunized with the polypeptide-KLH complex in complete Freund's adjuvant. The resulting
antisera are tested for anti-polypeptide activity by, for example, binding the peptide to plastic,
blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated,
goat anti-rabbit IgG.

XI. Purification of Naturally Occurring hSBP Using Specific Antibodies 

Naturally-occurring or recombinant hSBP is substantially purified by immunoaffinity
chromatography using antibodies specific for hSBP. An immunoaffinity column is constructed
by covalently coupling anti-hSBP antibody to an activated chromatographic resin such as
CnBr-activated Sepharose (Pharmacia Biotech). After coupling, the resin is blocked and washed
according to the manufacturer's instructions.

Media containing hSBP polypeptide is passed over the immunoaffinity column, and the
column is washed under conditions that allow the preferential absorbance of hSBP (e.g., high
ionic strength buffers in the presence of detergent). The column is eluted under conditions that
disrupt antibody-hSBP binding (e.g., a buffer of pH 2-3 or a high concentration of a chaotrope
such as urea or thiocyanate ion), and hSBP polypeptide is collected.
XII. Identification of Molecules Which Interact with hSBP

hSBP polypeptides, especially biologically active hSBP polypeptides, are labeled with
125I Bolton-Hunter reagent (Bolton and Hunter (1973) Biochem J 133:529). Candidate molecules
previously arrayed in the wells of a 96 well plate are incubated with the labeled hSBP
polypeptides, washed, and assayed for labeled hSBP complex. Data obtained using different
concentrations of hSBP are used to calculate values for the number, affinity, and association of
hSBP with the candidate molecules.

All publications and patents mentioned in the above specification are herein incorporated
by reference. Various modifications and variations of the described method and system of the
invention will be apparent to those skilled in the art without departing from the scope and spirit
of the invention. Although the invention has been described in connection with specific preferred
embodiments, it should be understood that the invention as claimed should not be unduly limited
to such specific embodiments. Indeed, various modifications of the described modes for carrying
out the invention which are obvious to those skilled in molecular biology or related fields are
intended to be within the scope of the following claims.

Before the present nucleotide and polypeptide sequences are described, it is to be
understood that this invention is not limited to the particular methodology, protocols, cell lines,
vectors, and reagents described, as such may, of course, vary. It is also to be understood that the
terminology used herein is for the purpose of describing particular embodiments only, and is not
intended to limit the scope of the present invention, which will be limited only by the appended
claims.

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an", and "the" include plural referents unless the context clearly dictates otherwise. Thus, for
example, reference to "a host cell" includes a plurality of such host cells and reference to "the 
antibody" includes reference to one or more antibodies and equivalents thereof known to those 
skilled in the art, and so forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention
belongs. Although any methods, devices and materials similar or equivalent to those described 
herein can be used in the practice or testing of the invention, the preferred methods, devices and 
materials are now described. 

All publications mentioned herein are incorporated herein by reference for the purpose of 
describing and disclosing the cell lines, vectors, and methodologies which are described in the 
publications which might be used in connection with the presently described invention. The 
publications discussed herein are provided solely for their disclosure prior to the filing date of the 
present application. Nothing herein is to be construed as an admission that the inventors are not 
entitled to antedate such disclosure by virtue of prior invention.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: INCYTE PHARMACEUTICALS, INC.

(ii) TITLE OF INVENTION: BREAST TUMOR SPECIFIC PROTEINS

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
(B) STREET: 3174 Porter Drive
(C) CITY: Palo Alto
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 94304

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) PCT APPLICATION NUMBER: To Be Assigned
(B) FILING DATE: Herewith
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/747,547
(B) FILING DATE: 12-NOV-1996

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Billings, Lucy J.
(B) REGISTRATION NUMBER: 36,749
(C) REFERENCE/DOCKET NUMBER: ?r-0C77 PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (650) 855-0555
(B) TELEFAX: (650) 845-4166



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 90 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Lys Leu Ser Val Cys Leu Leu Leu Val Thr Leu Ala Leu Cys Cys
1               5                   10                  15

Tyr Gln Ala Asn Ala Glu Phe Cys Pro Ala Leu Val Ser Glu Leu Leu
            20                  25                  30

Asp Phe Phe Phe Ile Ser Glu Pro Leu Phe Lys Leu Ser Leu Ala Lys
        35                  40                  45

Phe Asp Ala Pro Pro Glu Ala Val Ala Ala Lys Leu Gly Val Lys Arg
    50                  55                  60

Cys Thr Asp Gln Met Ser Leu Gln Lys Arg Ser Leu Ile Ala Glu Val
65                  70                  75                  80

Leu Val Lys Ile Leu Lys Lys Cys Ser Val
                85                  90



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 405 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GTCCAAATCA CTCATTGTTT GTGAAAGCTG AGCTCACAGC AAAACAAGCC ACC ATG  56
                                                           Met
                                                           1

AAG CTG TCG GTG TGT CTC CTG CTG GTC ACG CTG GCC CTC TGC TGC TAC  104
Lys Leu Ser Val Cys Leu Leu Leu Val Thr Leu Ala Leu Cys Cys Tyr
            5                   10                  15

CAG GCC AAT GCC GAG TTC TGC CCA GCT CTT GTT TCT GAG CTG TTA GAC  152
Gln Ala Asn Ala Glu Phe Cys Pro Ala Leu Val Ser Glu Leu Leu Asp
        20                  25                  30

TTC ... TTC ATT AGT GAA CCT CTG TTC AAG TTA AGT CTT GCC AAA TTT  200
Phe Phe Phe Ile Ser Glu Pro Leu Phe Lys Leu Ser Leu Ala Lys Phe
    35                  40                  45

GAT GCC CCT CCG GAA GCT GTT GCA GCC AAG TTA GGA GTG AAG AGA TGC  248
Asp Ala Pro Pro Glu Ala Val Ala Ala Lys Leu Gly Val Lys Arg Cys
50                  55                  60                  65

ACG GAT CAG ATG TCC CTT CAG AAA CGA AGC CTC ATT GCG GAA GTC CTG  296
Thr Asp Gln Met Ser Leu Gln Lys Arg Ser Leu Ile Ala Glu Val Leu
                70                  75                  80

GTG AAA ATA TTG AAG AAA TGT AGT GTG TGA CATGTAAAAA CTTTCATCCT  346
Val Lys Ile Leu Lys Lys Cys Ser Val *
            85                  90

GGTTTCCACT GTCTTTCAAT GACACCCTGA TCTTCACTGC AGAATGTAAA GGTTTCAAC  405






WO 98/21331 PCT/US97/20674



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Lys Leu Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys
1               5                   10                  15

Tyr Ala Gly Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr
            20                  25                  30

Ile Asn Pro Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu
        35                  40                  45

Phe Ile Asp Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu
    50                  55                  60

Cys Phe Leu Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe
65                  70                  75                  80

Met Gln Leu Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe
                85                  90



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 495 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GATCCTTGCC ACCCGCGACT GAACACCGAC AGCAGCAGCC TCACC ATG AAG TTG  54
                                                  Met Lys Leu

CTG ATG GTC CTC ATG CTG GCG GCC CTC TCC CAG CAC TGC TAC GCA GGC  102
Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys Tyr Ala Gly
    5                   10                  15

TCT GGC TGC CCC TTA TTG GAG AAT GTG ATT TCC AAG ACA ATC AAT CCA  150
Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr Ile Asn Pro
20                  25                  30                  35

CAA GTG TCT AAG ACT GAA TAC AAA GAA CTT CTT CAA GAG TTC ATA GAC  198
Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu Phe Ile Asp
                40                  45                  50

GAC AAT GCC ACT ACA AAT GCC ATA GAT GAA TTG AAG GAA TGT TTT CTT  246
Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu Cys Phe Leu
            55                  60                  65

AAC CAA ACG GAT GAA ACT CTG AGC AAT GTT GAG GTG TTT ATG CAA TTA  294
Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe Met Gln Leu
        70                  75                  80

ATA TAT GAC AGC AGT CTT TGT GAT TTA TTT TAA CTT TCT GCA AGA CCT  342
Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe *
    85                  90

TTG GCT CAC AGA ACT GCA GGG TAT GGT GAG AAA CCA ACT ACG GAT TGC  390

TGC AAA CCA CAC CTT CTC TTT CTT ATG TCT TTT TAC TAC AAA CTA CAA  438

GAC AAT TGT TGA AAC CTG CTA TAC ATG TTT ATT TTA ATA AAT TGA TGG  486

CAA AAA CTG  495



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Met Ser Thr Ile Lys Leu Ser Leu Cys Leu Leu Ile Met Leu Ala Val
1               5                   10                  15

Cys Cys Tyr Glu Ala Asn Ala Ser Gln Ile Cys Glu Leu Val Ala His
            20                  25                  30

Glu Thr Ile Ser Phe Leu Met Lys Ser Glu Glu Glu Leu Lys Lys Glu
        35                  40                  45

Leu Glu Met Tyr Asn Ala Pro Pro Ala Ala Val Glu Ala Lys Leu Glu
    50                  55                  60

Val Lys Arg Cys Val Asp Gln Met Ser Asn Gly Asp Arg Leu Val Val
65                  70                  75                  80

Ala Glu Thr Leu Val Tyr Ile Phe Leu Glu Cys Gly Val Lys Gln Trp
                85                  90                  95

Val Glu Thr Tyr Tyr Pro Glu Ile Asp Phe Tyr Tyr Asp Met Asn
            100                 105                 110



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 412 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

CGCTAAGTAG AAAACTGAA ATG AGC ACC ATT AAG CTG AGC CTG TGT CTT CTG  52
                     Met Ser Thr Ile Lys Leu Ser Leu Cys Leu Leu
                     1               5                   10

ATC ATG CTG GCT GTT TGT TGC TAT GAA GCT AAT GCT AGC CAG ATC TGT  100
Ile Met Leu Ala Val Cys Cys Tyr Glu Ala Asn Ala Ser Gln Ile Cys
            15                  20                  25

GAA CTT GTT GCC CAT GAA ACC ATA AGC TTC TTA ATG AAA AGT GAG GAA  148
Glu Leu Val Ala His Glu Thr Ile Ser Phe Leu Met Lys Ser Glu Glu
        30                  35                  40

GAA CTG AAG AAG GAA CTT GAG ATG TAT AAT GCA CCT CCA GCA GCT GTT  196
Glu Leu Lys Lys Glu Leu Glu Met Tyr Asn Ala Pro Pro Ala Ala Val
    45                  50                  55

GAA GCA AAA CTG GAA GTG AAG AGA TGT GTA GAC CAG ATG AGC AAT GGA  244
Glu Ala Lys Leu Glu Val Lys Arg Cys Val Asp Gln Met Ser Asn Gly
60                  65                  70                  75

GAC AGA TTG GTA GTA GCA GAA ACA CTG GTA TAC ATT TTT TTG GAA TGT  292
Asp Arg Leu Val Val Ala Glu Thr Leu Val Tyr Ile Phe Leu Glu Cys
                80                  85                  90

GGT GTG AAA CAA TGG GTA GAA ACA TAT TAT CCT GAG ATC GAT TTC TAC  340
Gly Val Lys Gln Trp Val Glu Thr Tyr Tyr Pro Glu Ile Asp Phe Tyr
            95                  100                 105

TAC GAT ATG AAC TGA TTT TTC CTG TTC AAT GTG ATG GTT TCA AGT CTT  388
Tyr Asp Met Asn *
        110

GCA CCA ATA AAT TAT TCT CCT TGC  412



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 440 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GCTCATCCTT TGCTAAGTCT GAAAACAAAC TGAGCACCAT GAAGCTGTCC CTGTGTCTTC  60

TGTTGGTCAT CCTGGCTGTT CATTGCTATG AAGCTAATGC TGCAAACGTC TGTCCAGCAG  120

TTCTTTCTGT AAGCAAATCT TTCCTATTTG ACAAGGTGGA GAAATTTGAG GCCTATCTTC  180

AGACATTTAA CGCACCTCCA GAGGCTGTTA AAGCAAAAGT GGAAGTGAAG AAATGTATAG  240

ACAGCACTCT GAACTATTTA GAGAAAATGG AAATGGGAAA AATACTGGCA GAAGTCGTTG  300

GGTATTGTAA AGGAACAGAA AACTGAAACA TGGCTCTTCC TGGTCTCCAT TGCTTCTCAC  360

AGATAAACTG ACTTTCCTTG CCCAATGTGA AGGTTTCAAC GTCTTGCACT AATAAATTAC  420

TCTCCTTGCA TGTTAAAAAA  440

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 112 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Met Arg Leu Ser Leu Cys Leu Leu Thr Ile Leu Val Val Cys Cys Tyr
1               5                   10                  15

Glu Ala Asn Gly Gln Thr Leu Ala Gly Gln Val Cys Gln Ala Leu Gln
            20                  25                  30

Asp Val Thr Ile Thr Phe Leu Leu Asn Pro Glu Glu Glu Leu Lys Arg
        35                  40                  45

Glu Leu Glu Glu Phe Asp Ala Pro Pro Glu Ala Val Glu Ala Asn Leu
    50                  55                  60

Lys Val Lys Arg Cys Ile Asn Lys Ile Met Tyr Gly Asp Arg Leu Ser
65                  70                  75                  80

Met Gly Thr Ser Leu Val Phe Ile Met Leu Lys Cys Asp Val Lys Val
                85                  90                  95

Trp Leu Gln Ile Asn Phe Pro Arg Gly Arg Trp Phe Ser Glu Ile Asn
            100                 105                 110



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 91 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Lys Leu Ala Ile Thr Leu Ala Leu Val Thr Leu Ala Leu Leu Cys
1               5                   10                  15

Ser Pro Ala Ser Ala Gly Ile Cys Pro Arg Phe Ala His Val Ile Glu
            20                  25                  30

Asn Leu Leu Leu Gly Thr Pro Ser Ser Tyr Glu Thr Ser Leu Lys Glu
        35                  40                  45

Phe Glu Pro Asp Asp Thr Met Lys Asp Ala Gly Met Gln Met Lys Lys
    50                  55                  60

Val Leu Asp Ser Leu Pro Gln Thr Thr Arg Glu Asn Ile Met Lys Leu
65                  70                  75                  80

Thr Glu Lys Ile Val Lys Ser Pro Leu Cys Met
                85                  90



(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Lys Leu Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys
1               5                   10                  15

Tyr Ala Gly Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr
            20                  25                  30

Ile Asn Pro Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu
        35                  40                  45

Phe Ile Asp Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu
    50                  55                  60

Cys Phe Leu Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe
65                  70                  75                  80

Met Gln Leu Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe
                85                  90



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 503 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GACAGCGGCT TCCTTGATCC TTGCCACCCG CGACTGAACA CCGACAGCAG CAGCCTCACC  60

ATG AAG TTG CTG ATG GTC CTC ATG CTG GCG GCC CTC TCC CAG CAC TGC  108
Met Lys Leu Leu Met Val Leu Met Leu Ala Ala Leu Ser Gln His Cys
1               5                   10                  15

TAC GCA GGC TCT GGC TGC CCC TTA TTG GAG AAT GTG ATT TCC AAG ACA  156
Tyr Ala Gly Ser Gly Cys Pro Leu Leu Glu Asn Val Ile Ser Lys Thr
            20                  25                  30

ATC AAT CCA CAA GTG TCT AAG ACT GAA TAC AAA GAA CTT CTT CAA GAG  204
Ile Asn Pro Gln Val Ser Lys Thr Glu Tyr Lys Glu Leu Leu Gln Glu
        35                  40                  45

TTC ATA GAC GAC AAT GCC ACT ACA AAT GCC ATA GAT GAA TTG AAG GAA  252
Phe Ile Asp Asp Asn Ala Thr Thr Asn Ala Ile Asp Glu Leu Lys Glu
    50                  55                  60

TGT TTT CTT AAC CAA ACG GAT GAA ACT CTG AGC AAT GTT GAG GTG TTT  300
Cys Phe Leu Asn Gln Thr Asp Glu Thr Leu Ser Asn Val Glu Val Phe
65                  70                  75                  80

ATG CAA TTA ATA TAT GAC AGC AGT CTT TGT GAT TTA TTT TAA CTT TCT  348
Met Gln Leu Ile Tyr Asp Ser Ser Leu Cys Asp Leu Phe
                85                  90

GCA AGA CCT TTG GCT CAC AGA ACT GCA GGG TAT GGT GAG AAA CCA ACT  396

ACG GAT TGC TGC AAA CCA CAC CTT CTC TTT CTT ATG TCT TTT TAC TAC  444

AAA CTA CAA GAC AAT TGT TGA AAC CTG CTA TAC ATG TTT ATT TTA ATA  492

AAT TGA TGG CA  503



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Met Lys Leu Val Phe Leu Phe Leu Leu Val Thr Ile Pro Ile Cys Cys
1               5                   10                  15

Tyr Ala Ser Gly Ser Gly Cys Ser Ile Leu Asp Glu Val Ile Arg Gly
            20                  25                  30

Thr Ile Asn Ser Thr Val Thr Leu His Asp Tyr Met Lys Leu Val Lys
        35                  40                  45

Pro Tyr Val Gln Asp His Phe Thr Glu Lys Ala Val Lys Gln Phe Lys
    50                  55                  60

Gln Cys Phe Leu Asp Gln Thr Asp Lys Thr Leu Glu Asn Val Gly Val
65                  70                  75                  80

Met Met Glu Ala Ile Phe Asn Ser Glu Ser Cys Gln Gln Pro Ser
                85                  90                  95



(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 509 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

AGTTTCCTGA TTTCTGTCTT GGACAACAGA ACAACCCACA GGGACTGCC? CAAC ATG  57
                                                            Met
                                                            1

AAG CTG GTG ... CTA ... TTG TTG GTC ACC ATC CCT ATT TGC TGC TAT  105
Lys Leu Val Phe Leu Phe Leu Leu Val Thr Ile Pro Ile Cys Cys Tyr
            5                   10                  15

GCC AGT GGT ... GGC TGC AGT ATT CTA GAT GAA GTT ATT AGA GGT ACA  153
Ala Ser Gly Ser Gly Cys Ser Ile Leu Asp Glu Val Ile Arg Gly Thr
        20                  25                  30

ATT AAC TCA ACT GTG ACT TTA CAT GAC TAT ATG AAA TTA GTT AAG CCA  201
Ile Asn Ser Thr Val Thr Leu His Asp Tyr Met Lys Leu Val Lys Pro
    35                  40                  45

TAT GTA CAA GAT CAT ... ACT GAA AAG GCT GTG AAG CAA ... AAG CAG  249
Tyr Val Gln Asp His Phe Thr Glu Lys Ala Val Lys Gln Phe Lys Gln
50                  55                  60                  65

TGT ... CTA GAT CAG ... ... AAG ACT CTG GAA AAT GTT GGC GTG ATG  297
Cys Phe Leu Asp Gln Thr Asp Lys Thr Leu Glu Asn Val Gly Val Met
                70                  75                  80

ATG GAG GCA ATA TTT AAC AGT GAA AGC TGT CAA CAG CCA TCC TAA ACA  345
Met Glu Ala Ile Phe Asn Ser Glu Ser Cys Gln Gln Pro Ser
            85                  90                  95

TCT ACA AGA ... TTG GCC ACA GGA CTC CAG GAA ACT GGC AAT GGC CAA  393

GCA ACT GAT AAC ACA GAT CAT AAC TCT TCT TTC TTG AAC CCC TTT TTC  441

TAC CTA TAA AGT GCA AGA CGA TTG TTG AAA CCT CAA ATT TAT GTC TTT  489

CCA TTT TAT TAA ATT ATC TG  509

CLAIMS

1. A substantially purified human steroid binding protein C1 (hSBP1) polypeptide
comprising the amino acid sequence of SEQ ID NO:1 or fragments thereof.

2. An isolated and purified polynucleotide sequence encoding an hSBP1 polypeptide of
claim 1.

3. An isolated and purified polynucleotide sequence of claim 2 consisting of SEQ ID NO:2
or variants thereof.

4. A polynucleotide sequence which is complementary to SEQ ID NO:2 or degenerate
variants thereof.

5. A recombinant expression vector comprising the polynucleotide sequence of claim 2.

6. A recombinant host cell containing the polynucleotide sequence of claim 5. 

7. A method for producing a polypeptide comprising a polypeptide of SEQ ID NO:1, the
method comprising the steps of:

a) culturing the host cell of claim 6 under conditions suitable for the expression of the
polypeptide; and

b) recovering the polypeptide from the host cell culture. 

8. A pharmaceutical composition comprising a substantially purified hSBP polypeptide 
having an amino acid sequence of SEQ ID NO:1 in conjunction with a suitable pharmaceutical
carrier.

9. A purified antibody that specifically binds the polypeptide of claim 1.

10. A purified antagonist which specifically regulates or modulates the activity of the
polypeptide of claim 1.

11. A pharmaceutical composition comprising a substantially purified antagonist of the 
polypeptide of claim 1 in conjunction with a suitable pharmaceutical carrier. 

12. A substantially purified human steroid binding protein C2 (hSBP2) polypeptide
comprising the amino acid sequence of SEQ ID NO:3 or fragments thereof.

13. An isolated and purified polynucleotide sequence encoding an hSBP2 polypeptide of
claim 12.

14. An isolated and purified polynucleotide sequence of claim 13 consisting of SEQ ID
NO:4 or variants thereof.

15. A polynucleotide sequence which is complementary to SEQ ID NO:4 or degenerate
variants thereof.

16. A recombinant expression vector comprising the polynucleotide sequence of claim 13.

17. A recombinant host cell containing the polynucleotide sequence of claim 13. 

18. A method for producing a polypeptide comprising a polypeptide of SEQ ID NO:3, the
method comprising the steps of:

a) culturing the host cell of claim 17 under conditions suitable for the expression of the
polypeptide; and

b) recovering the polypeptide from the host cell culture.

19. A pharmaceutical composition comprising a substantially purified human steroid binding
protein C2 (hSBP2) polypeptide having an amino acid sequence of SEQ ID NO:3 in conjunction
with a suitable pharmaceutical carrier.

20. A purified antibody that specifically binds the polypeptide of claim 12. 

21. A purified antagonist which specifically regulates or modulates the activity of the 
polypeptide of claim 12. 

22. A pharmaceutical composition comprising a substantially purified antagonist of the
polypeptide of claim 12 in conjunction with a suitable pharmaceutical carrier.